Run a Batch job using Workflows


Batch is a fully managed service that lets you schedule, queue, and execute batch processing workloads on Compute Engine virtual machine (VM) instances. Batch provisions resources and manages capacity on your behalf, allowing your batch workloads to run at scale.

Workflows lets you execute the services you need in an order that you define, which you describe using the Workflows syntax.

In this tutorial, you use the Workflows connector for Batch to schedule and run a Batch job that executes six tasks in parallel on two Compute Engine VMs. Using both Batch and Workflows allows you to combine the advantages they offer and efficiently provision and orchestrate the entire process.

Objectives

In this tutorial you will:

  1. Create an Artifact Registry repository for a Docker container image.
  2. Get the code for the batch processing workload from GitHub: a sample program that generates prime numbers in batches of 10,000.
  3. Build the Docker image for the workload.
  4. Deploy and execute a workflow that does the following:
    1. Creates a Cloud Storage bucket to store the results of the prime number generator.
    2. Schedules and runs a Batch job that runs the Docker container as six tasks in parallel on two Compute Engine VMs.
    3. Optionally deletes the Batch job after it has completed.
  5. Confirm that the results are as expected and that the batches of generated prime numbers are stored in Cloud Storage.

You can run most of the following commands in the Google Cloud console, or run all the commands using the Google Cloud CLI in either your terminal or Cloud Shell.

Costs

This tutorial uses billable components of Google Cloud.

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

Before you begin

Security constraints defined by your organization might prevent you from completing the following steps. For troubleshooting information, see Develop applications in a constrained Google Cloud environment.

Console

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  2. Make sure that billing is enabled for your Google Cloud project. Learn how to check if billing is enabled on a project.

  3. Enable the Artifact Registry, Batch, Cloud Build, Compute Engine, Workflow Executions, and Workflows APIs.

    Enable the APIs

  4. Create a service account for your workflow to use for authentication with other Google Cloud services and grant it the appropriate roles:

    1. In the Google Cloud console, go to the Create service account page.

      Go to Create service account

    2. Select your project.

    3. In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.

      In the Service account description field, enter a description. For example, Service account for tutorial.

    4. Click Create and continue.

    5. In the Select a role list, filter for the following roles to grant to the user-managed service account you created in the previous step:

      • Batch Job Editor: to edit Batch jobs.
      • Logs Writer: to write logs.
      • Storage Admin: to control Cloud Storage resources.

      For additional roles, click Add another role and add each additional role.

    6. Click Continue.

    7. To finish creating the account, click Done.

  5. Grant the IAM Service Account User role on the default service account to the user-managed service account created in the previous step. After you enable the Compute Engine API, the default service account is the Compute Engine default service account (PROJECT_NUMBER-compute@developer.gserviceaccount.com), and the permission is typically assigned through the roles/iam.serviceAccountUser role.

    1. On the Service Accounts page, click the email address of the default service account (PROJECT_NUMBER-compute@developer.gserviceaccount.com).

    2. Click the Permissions tab.

    3. Click the Grant access button.

    4. To add a new principal, enter the email address of your service account (SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com).

    5. In the Select a role list, select the Service Accounts > Service Account User role.

    6. Click Save.

gcloud

  1. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  2. Make sure that billing is enabled for your Google Cloud project. Learn how to check if billing is enabled on a project.

  3. Enable the Artifact Registry, Batch, Cloud Build, Compute Engine, Workflow Executions, and Workflows APIs.

    gcloud services enable artifactregistry.googleapis.com \
      batch.googleapis.com \
      cloudbuild.googleapis.com \
      compute.googleapis.com \
      workflowexecutions.googleapis.com \
      workflows.googleapis.com
    
  4. Create a service account for your workflow to use for authentication with other Google Cloud services and grant it the appropriate roles.

    1. Create the service account:

      gcloud iam service-accounts create SERVICE_ACCOUNT_NAME
      

      Replace SERVICE_ACCOUNT_NAME with a name for the service account.

    2. Grant roles to the user-managed service account you created in the previous step. Run the following command once for each of the following IAM roles:

      • roles/batch.jobsEditor: to edit Batch jobs.
      • roles/logging.logWriter: to write logs.
      • roles/storage.admin: to control Cloud Storage resources.

      gcloud projects add-iam-policy-binding PROJECT_ID \
        --member=serviceAccount:SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com \
        --role=ROLE
      

      Replace the following:

      • PROJECT_ID: the project ID where you created the service account
      • ROLE: the role to grant

  5. Grant the IAM Service Account User role on the default service account to the user-managed service account you created in the previous step. After you enable the Compute Engine API, the default service account is the Compute Engine default service account (PROJECT_NUMBER-compute@developer.gserviceaccount.com), and the permission is typically assigned through the roles/iam.serviceAccountUser role.

    PROJECT_NUMBER=$(gcloud projects describe PROJECT_ID --format='value(projectNumber)')
    gcloud iam service-accounts add-iam-policy-binding \
      $PROJECT_NUMBER-compute@developer.gserviceaccount.com \
      --member=serviceAccount:SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com \
      --role=roles/iam.serviceAccountUser
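
The three project-level role grants in step 4 can be scripted instead of run one at a time. The following is a minimal sketch and not part of the original tutorial: grant_workflow_roles is a hypothetical helper name, and it assumes you pass your own project ID and service account name.

```shell
# Hypothetical helper (not from the tutorial): grants the three project-level
# roles listed in step 4 to the workflow's service account, one call per role.
grant_workflow_roles() (
  project_id="$1"
  sa_name="$2"
  member="serviceAccount:${sa_name}@${project_id}.iam.gserviceaccount.com"
  for role in roles/batch.jobsEditor roles/logging.logWriter roles/storage.admin; do
    # Stop at the first failure so that a partial grant is noticed.
    gcloud projects add-iam-policy-binding "$project_id" \
      --member="$member" \
      --role="$role" || exit 1
  done
)

# Example call, using the same placeholders as the steps above:
# grant_workflow_roles PROJECT_ID SERVICE_ACCOUNT_NAME
```

The function body runs in a subshell, so its variables do not leak into your session and a failed grant returns a non-zero status without closing your terminal.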
    

Create an Artifact Registry repository

Create a repository to store your Docker container image.

Console

  1. In the Google Cloud console, go to the Repositories page.

    Go to Repositories

  2. Click Create Repository.

  3. Enter containers as the repository name.

  4. For Format, choose Docker.

  5. For Location Type, choose Region.

  6. In the Region list, select us-central1.

  7. Click Create.

gcloud

Run the following command:

  gcloud artifacts repositories create containers \
    --repository-format=docker \
    --location=us-central1

You have created an Artifact Registry repository named containers in the us-central1 region. For more information about supported regions, see Artifact Registry locations.
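
If you repeat the tutorial, the create command typically reports that the repository already exists. One way to make the step repeatable is to check for the repository first. The sketch below assumes the repository name and region used in this tutorial; ensure_repository is a hypothetical helper name.

```shell
# Hypothetical helper: creates the Docker repository only when describing it
# fails, which is treated here as "the repository does not exist yet".
ensure_repository() (
  repo="${1:-containers}"
  location="${2:-us-central1}"
  if gcloud artifacts repositories describe "$repo" --location="$location" >/dev/null 2>&1; then
    echo "Repository $repo already exists in $location"
  else
    gcloud artifacts repositories create "$repo" \
      --repository-format=docker \
      --location="$location"
  fi
)

# Example call with the tutorial's defaults:
# ensure_repository containers us-central1
```

Note that a failed describe can also mean a permissions or network problem, in which case the create command surfaces the underlying error.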

Get the code samples

Google Cloud stores the application source code for this tutorial in GitHub. You can clone that repository or download the samples.

  1. Clone the sample app repository to your local machine:

    git clone https://github.com/GoogleCloudPlatform/batch-samples.git
    

    Alternatively, you can download the samples in the main.zip file and extract it.

  2. Change to the directory that contains the sample code:

    cd batch-samples/primegen
    

You now have the source code for the application in your development environment.

Build the Docker image using Cloud Build

The Dockerfile contains the information needed to build a Docker image using Cloud Build. Run the following command to build it:

gcloud builds submit \
  -t us-central1-docker.pkg.dev/PROJECT_ID/containers/primegen-service:v1 PrimeGenService/

Replace PROJECT_ID with your Google Cloud project ID.

When the build is complete, you should see output similar to the following:

DONE
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ID: a54818cc-5d14-467b-bfda-5fc9590af68c
CREATE_TIME: 2022-07-29T01:48:50+00:00
DURATION: 48S
SOURCE: gs://project-name_cloudbuild/source/1659059329.705219-17aee3a424a94679937a7200fab15bcf.tgz
IMAGES: us-central1-docker.pkg.dev/project-name/containers/primegen-service:v1
STATUS: SUCCESS

Using a Dockerfile, you've built a Docker image named primegen-service and pushed the image to an Artifact Registry repository named containers.
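
The image path passed to -t has the form REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG. As an illustration only, the following sketch assembles that path from its parts; image_uri is a hypothetical helper, not something the sample repository provides.

```shell
# Hypothetical helper: assembles an Artifact Registry Docker image path.
image_uri() {
  # $1 = region, $2 = project ID, $3 = repository, $4 = image name, $5 = tag
  printf '%s-docker.pkg.dev/%s/%s/%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with this tutorial's values; PROJECT_ID stays a placeholder:
# gcloud builds submit \
#   -t "$(image_uri us-central1 PROJECT_ID containers primegen-service v1)" PrimeGenService/
```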

Deploy a workflow that schedules and runs a Batch job

The following workflow schedules and runs a Batch job that runs a Docker container as six tasks in parallel on two Compute Engine VMs. The result is the generation of six batches of prime numbers, stored in a Cloud Storage bucket.
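
The init step of the workflow definition in this section derives the job ID from the current Unix time and the bucket name from the project ID plus the job ID. The sketch below mirrors that naming in shell so that you can predict the names before running the workflow. The helpers job_id and bucket_name are hypothetical, and the length check reflects the 63-character limit that Cloud Storage applies to bucket names without dots.

```shell
# Hypothetical helpers mirroring the expressions in the workflow's init step.
job_id() {
  # $1 = Unix timestamp in seconds (defaults to the current time)
  printf 'job-primegen-%s\n' "${1:-$(date +%s)}"
}

bucket_name() (
  # $1 = project ID, $2 = job ID
  name="$1-$2"
  # Cloud Storage bucket names without dots are limited to 63 characters.
  if [ "${#name}" -gt 63 ]; then
    echo "bucket name too long: $name" >&2
    exit 1
  fi
  printf '%s\n' "$name"
)

# Example call; PROJECT_ID stays a placeholder:
# bucket_name PROJECT_ID "$(job_id)"
```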

Console

  1. In the Google Cloud console, go to the Workflows page.

    Go to Workflows

  2. Click Create.

  3. Enter a name for the new workflow, such as batch-workflow.

  4. In the Region list, select us-central1.

  5. Select the Service account you previously created.

  6. Click Next.

  7. In the workflow editor, enter the following definition for your workflow:

    YAML

    main:
      params: [args]
      steps:
        - init:
            assign:
              - projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
              - region: "us-central1"
              - imageUri: ${region + "-docker.pkg.dev/" + projectId + "/containers/primegen-service:v1"}
              - jobId: ${"job-primegen-" + string(int(sys.now()))}
              - bucket: ${projectId + "-" + jobId}
        - createBucket:
            call: googleapis.storage.v1.buckets.insert
            args:
              query:
                project: ${projectId}
              body:
                name: ${bucket}
        - logCreateBucket:
            call: sys.log
            args:
              data: ${"Created bucket " + bucket}
        - logCreateBatchJob:
            call: sys.log
            args:
              data: ${"Creating and running the batch job " + jobId}
        - createAndRunBatchJob:
            call: googleapis.batch.v1.projects.locations.jobs.create
            args:
                parent: ${"projects/" + projectId + "/locations/" + region}
                jobId: ${jobId}
                body:
                  taskGroups:
                    taskSpec:
                      runnables:
                        - container:
                            imageUri: ${imageUri}
                          environment:
                            variables:
                              BUCKET: ${bucket}
                    # Run 6 tasks on 2 VMs
                    taskCount: 6
                    parallelism: 2
                  logsPolicy:
                    destination: CLOUD_LOGGING
            result: createAndRunBatchJobResponse
        # You can delete the batch job or keep it for debugging
        - logDeleteBatchJob:
            call: sys.log
            args:
              data: ${"Deleting the batch job " + jobId}
        - deleteBatchJob:
            call: googleapis.batch.v1.projects.locations.jobs.delete
            args:
                name: ${"projects/" + projectId + "/locations/" + region + "/jobs/" + jobId}
            result: deleteResult
        - returnResult:
            return:
              jobId: ${jobId}
              bucket: ${bucket}

    JSON

    {
      "main": {
        "params": [
          "args"
        ],
        "steps": [
          {
            "init": {
              "assign": [
                {
                  "projectId": "${sys.get_env(\"GOOGLE_CLOUD_PROJECT_ID\")}"
                },
                {
                  "region": "us-central1"
                },
                {
                  "imageUri": "${region + \"-docker.pkg.dev/\" + projectId + \"/containers/primegen-service:v1\"}"
                },
                {
                  "jobId": "${\"job-primegen-\" + string(int(sys.now()))}"
                },
                {
                  "bucket": "${projectId + \"-\" + jobId}"
                }
              ]
            }
          },
          {
            "createBucket": {
              "call": "googleapis.storage.v1.buckets.insert",
              "args": {
                "query": {
                  "project": "${projectId}"
                },
                "body": {
                  "name": "${bucket}"
                }
              }
            }
          },
          {
            "logCreateBucket": {
              "call": "sys.log",
              "args": {
                "data": "${\"Created bucket \" + bucket}"
              }
            }
          },
          {
            "logCreateBatchJob": {
              "call": "sys.log",
              "args": {
                "data": "${\"Creating and running the batch job \" + jobId}"
              }
            }
          },
          {
            "createAndRunBatchJob": {
              "call": "googleapis.batch.v1.projects.locations.jobs.create",
              "args": {
                "parent": "${\"projects/\" + projectId + \"/locations/\" + region}",
                "jobId": "${jobId}",
                "body": {
                  "taskGroups": {
                    "taskSpec": {
                      "runnables": [
                        {
                          "container": {
                            "imageUri": "${imageUri}"
                          },
                          "environment": {
                            "variables": {
                              "BUCKET": "${bucket}"
                            }
                          }
                        }
                      ]
                    },
                    "taskCount": 6,
                    "parallelism": 2
                  },
                  "logsPolicy": {
                    "destination": "CLOUD_LOGGING"
                  }
                }
              },
              "result": "createAndRunBatchJobResponse"
            }
          },
          {
            "logDeleteBatchJob": {
              "call": "sys.log",
              "args": {
                "data": "${\"Deleting the batch job \" + jobId}"
              }
            }
          },
          {
            "deleteBatchJob": {
              "call": "googleapis.batch.v1.projects.locations.jobs.delete",
              "args": {
                "name": "${\"projects/\" + projectId + \"/locations/\" + region + \"/jobs/\" + jobId}"
              },
              "result": "deleteResult"
            }
          },
          {
            "returnResult": {
              "return": {
                "jobId": "${jobId}",
                "bucket": "${bucket}"
              }
            }
          }
        ]
      }
    }
    
  8. Click Deploy.

gcloud

  1. Create a source code file for your workflow:

    touch batch-workflow.JSON_OR_YAML
    

    Replace JSON_OR_YAML with yaml or json depending on the format of your workflow.

  2. In a text editor, copy the following workflow to your source code file:

    YAML

    main:
      params: [args]
      steps:
        - init:
            assign:
              - projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
              - region: "us-central1"
              - imageUri: ${region + "-docker.pkg.dev/" + projectId + "/containers/primegen-service:v1"}
              - jobId: ${"job-primegen-" + string(int(sys.now()))}
              - bucket: ${projectId + "-" + jobId}
        - createBucket:
            call: googleapis.storage.v1.buckets.insert
            args:
              query:
                project: ${projectId}
              body:
                name: ${bucket}
        - logCreateBucket:
            call: sys.log
            args:
              data: ${"Created bucket " + bucket}
        - logCreateBatchJob:
            call: sys.log
            args:
              data: ${"Creating and running the batch job " + jobId}
        - createAndRunBatchJob:
            call: googleapis.batch.v1.projects.locations.jobs.create
            args:
                parent: ${"projects/" + projectId + "/locations/" + region}
                jobId: ${jobId}
                body:
                  taskGroups:
                    taskSpec:
                      runnables:
                        - container:
                            imageUri: ${imageUri}
                          environment:
                            variables:
                              BUCKET: ${bucket}
                    # Run 6 tasks on 2 VMs
                    taskCount: 6
                    parallelism: 2
                  logsPolicy:
                    destination: CLOUD_LOGGING
            result: createAndRunBatchJobResponse
        # You can delete the batch job or keep it for debugging
        - logDeleteBatchJob:
            call: sys.log
            args:
              data: ${"Deleting the batch job " + jobId}
        - deleteBatchJob:
            call: googleapis.batch.v1.projects.locations.jobs.delete
            args:
                name: ${"projects/" + projectId + "/locations/" + region + "/jobs/" + jobId}
            result: deleteResult
        - returnResult:
            return:
              jobId: ${jobId}
              bucket: ${bucket}

    JSON

    {
      "main": {
        "params": [
          "args"
        ],
        "steps": [
          {
            "init": {
              "assign": [
                {
                  "projectId": "${sys.get_env(\"GOOGLE_CLOUD_PROJECT_ID\")}"
                },
                {
                  "region": "us-central1"
                },
                {
                  "imageUri": "${region + \"-docker.pkg.dev/\" + projectId + \"/containers/primegen-service:v1\"}"
                },
                {
                  "jobId": "${\"job-primegen-\" + string(int(sys.now()))}"
                },
                {
                  "bucket": "${projectId + \"-\" + jobId}"
                }
              ]
            }
          },
          {
            "createBucket": {
              "call": "googleapis.storage.v1.buckets.insert",
              "args": {
                "query": {
                  "project": "${projectId}"
                },
                "body": {
                  "name": "${bucket}"
                }
              }
            }
          },
          {
            "logCreateBucket": {
              "call": "sys.log",
              "args": {
                "data": "${\"Created bucket \" + bucket}"
              }
            }
          },
          {
            "logCreateBatchJob": {
              "call": "sys.log",
              "args": {
                "data": "${\"Creating and running the batch job \" + jobId}"
              }
            }
          },
          {
            "createAndRunBatchJob": {
              "call": "googleapis.batch.v1.projects.locations.jobs.create",
              "args": {
                "parent": "${\"projects/\" + projectId + \"/locations/\" + region}",
                "jobId": "${jobId}",
                "body": {
                  "taskGroups": {
                    "taskSpec": {
                      "runnables": [
                        {
                          "container": {
                            "imageUri": "${imageUri}"
                          },
                          "environment": {
                            "variables": {
                              "BUCKET": "${bucket}"
                            }
                          }
                        }
                      ]
                    },
                    "taskCount": 6,
                    "parallelism": 2
                  },
                  "logsPolicy": {
                    "destination": "CLOUD_LOGGING"
                  }
                }
              },
              "result": "createAndRunBatchJobResponse"
            }
          },
          {
            "logDeleteBatchJob": {
              "call": "sys.log",
              "args": {
                "data": "${\"Deleting the batch job \" + jobId}"
              }
            }
          },
          {
            "deleteBatchJob": {
              "call": "googleapis.batch.v1.projects.locations.jobs.delete",
              "args": {
                "name": "${\"projects/\" + projectId + \"/locations/\" + region + \"/jobs/\" + jobId}"
              },
              "result": "deleteResult"
            }
          },
          {
            "returnResult": {
              "return": {
                "jobId": "${jobId}",
                "bucket": "${bucket}"
              }
            }
          }
        ]
      }
    }
    
  3. Deploy the workflow by entering the following command:

    gcloud workflows deploy batch-workflow \
      --source=batch-workflow.yaml \
      --location=us-central1 \
      --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
    

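    Replace SERVICE_ACCOUNT_NAME with the name of the service account you previously created and PROJECT_ID with your Google Cloud project ID. If you saved the workflow as JSON, set --source to batch-workflow.json.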
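
Because step 1 lets you save the workflow as either YAML or JSON, a small wrapper can keep the --source file name consistent with the format you chose. This is a sketch under that assumption; deploy_batch_workflow is a hypothetical name, and the account email is whatever you created earlier.

```shell
# Hypothetical wrapper: deploys batch-workflow from the file format you chose.
deploy_batch_workflow() (
  format="$1"      # yaml or json
  sa_email="$2"    # SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
  case "$format" in
    yaml|json) ;;
    *) echo "format must be yaml or json, got: $format" >&2; exit 2 ;;
  esac
  gcloud workflows deploy batch-workflow \
    --source="batch-workflow.$format" \
    --location=us-central1 \
    --service-account="$sa_email"
)

# Example call, using the same placeholders as the command above:
# deploy_batch_workflow yaml SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
```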

Execute the workflow

Executing a workflow runs the current workflow definition associated with the workflow.

Console

  1. In the Google Cloud console, go to the Workflows page.

    Go to Workflows

  2. On the Workflows page, click the batch-workflow workflow to go to its details page.

  3. On the Workflow details page, click Execute.

  4. Click Execute again.

    The workflow execution should take a few minutes.

  5. View the results of the workflow in the Output pane.

    The results should look similar to the following:

    {
      "bucket": "project-name-job-primegen-TIMESTAMP",
      "jobId": "job-primegen-TIMESTAMP"
    }
    

gcloud

  1. Execute the workflow:

    gcloud workflows run batch-workflow \
      --location=us-central1

    The workflow execution should take a few minutes.

  2. You can check the status of a long-running execution.

  3. To get the status of the last completed execution, run the following command:

    gcloud workflows executions describe-last

    The results should be similar to the following:

    name: projects/PROJECT_NUMBER/locations/us-central1/workflows/batch-workflow/executions/EXECUTION_ID
    result: '{"bucket":"project-name-job-primegen-TIMESTAMP","jobId":"job-primegen-TIMESTAMP"}'
    startTime: '2022-07-29T16:08:39.725306421Z'
    state: SUCCEEDED
    status:
      currentSteps:
      - routine: main
        step: returnResult
    workflowRevisionId: 000001-9ba
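
The result field is a JSON string, so the bucket name can be extracted from it for use in the next section. The following sketch uses sed to keep dependencies minimal. The helper bucket_from_result is hypothetical, and the --format projection shown in the comment is an assumption about how you might fetch only that field.

```shell
# Hypothetical helper: reads the execution result JSON on stdin and prints
# the value of its "bucket" key.
bucket_from_result() {
  sed -n 's/.*"bucket"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Possible usage (the --format projection is an assumption; adjust as needed):
# gcloud workflows executions describe-last --format='value(result)' | bucket_from_result
```

A dedicated JSON tool is more robust if the result grows more complex; the sed expression only handles a flat object like the one shown above.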
    

List the objects in the output bucket

You can confirm that the results are as expected by listing the objects in your Cloud Storage output bucket.

Console

  1. In the Google Cloud console, go to the Cloud Storage Buckets page.

    Go to Buckets

  2. In the bucket list, click on the name of the bucket whose contents you want to view.

    The results should be similar to the following, with six files in total, each listing a batch of 10,000 prime numbers:

    primes-1-10000.txt
    primes-10001-20000.txt
    primes-20001-30000.txt
    primes-30001-40000.txt
    primes-40001-50000.txt
    primes-50001-60000.txt
    

gcloud

  1. Retrieve your output bucket name:

    gsutil ls

    The output is similar to the following:

    gs://PROJECT_ID-job-primegen-TIMESTAMP/

  2. List the objects in your output bucket:

    gsutil ls -r gs://PROJECT_ID-job-primegen-TIMESTAMP/**

    Replace TIMESTAMP with the timestamp returned by the previous command.

    The output should be similar to the following, with six files in total, each listing a batch of 10,000 prime numbers:

    gs://project-name-job-primegen-TIMESTAMP/primes-1-10000.txt
    gs://project-name-job-primegen-TIMESTAMP/primes-10001-20000.txt
    gs://project-name-job-primegen-TIMESTAMP/primes-20001-30000.txt
    gs://project-name-job-primegen-TIMESTAMP/primes-30001-40000.txt
    gs://project-name-job-primegen-TIMESTAMP/primes-40001-50000.txt
    gs://project-name-job-primegen-TIMESTAMP/primes-50001-60000.txt
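
With six tasks and batches of 10,000, the object names follow a simple arithmetic pattern: task i, counting from 0, covers i × 10,000 + 1 through (i + 1) × 10,000. The sketch below generates those names so that a listing can be compared against them. The helper expected_prime_files is hypothetical, and the naming pattern is inferred from the sample output above rather than from the sample program's source.

```shell
# Hypothetical helper: prints the object names expected for a given number of
# tasks and batch size, following the pattern in the sample output.
expected_prime_files() (
  tasks="${1:-6}"
  batch="${2:-10000}"
  i=0
  while [ "$i" -lt "$tasks" ]; do
    printf 'primes-%d-%d.txt\n' "$(( i * batch + 1 ))" "$(( (i + 1) * batch ))"
    i=$(( i + 1 ))
  done
)

# Example call with the tutorial's values:
# expected_prime_files 6 10000
```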
    

Clean up

If you created a new project for this tutorial, delete the project. If you used an existing project and wish to keep it without the changes added in this tutorial, delete resources created for the tutorial.

Delete the project

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete resources created in this tutorial

  1. Delete the Batch job:

    1. First retrieve the job name:

      gcloud batch jobs list --location=us-central1
      

      The output should be similar to the following:

      NAME: projects/project-name/locations/us-central1/jobs/job-primegen-TIMESTAMP
      STATE: SUCCEEDED
      

      In this output, job-primegen-TIMESTAMP is the name of the Batch job. Use it as BATCH_JOB_NAME in the next step.

    2. Delete the job:

      gcloud batch jobs delete BATCH_JOB_NAME --location=us-central1
      
  2. Delete the workflow:

    gcloud workflows delete WORKFLOW_NAME

    Replace WORKFLOW_NAME with the name of your workflow, such as batch-workflow.
    
  3. Delete the container repository:

    gcloud artifacts repositories delete REPOSITORY_NAME --location=us-central1

    Replace REPOSITORY_NAME with the name of your repository, such as containers.
    
  4. Cloud Build uses Cloud Storage to store build resources. To delete a Cloud Storage bucket, refer to Delete buckets.
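
The job list prints the full resource name, while the delete command takes only the final path segment. The following sketch strips the prefix and chains the three delete commands. The helpers job_short_name and delete_tutorial_resources are hypothetical, the workflow and repository names assume the values used in this tutorial, and --quiet (a global gcloud flag that suppresses confirmation prompts) is an addition you might prefer to omit.

```shell
# Hypothetical helper: keeps only the last segment of a resource name.
job_short_name() {
  # projects/P/locations/L/jobs/job-primegen-123 -> job-primegen-123
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: deletes the Batch job, the workflow, and the repository.
# Each command runs even if an earlier one fails, for example when the
# workflow's deleteBatchJob step has already removed the job.
delete_tutorial_resources() (
  job="$(job_short_name "$1")"
  gcloud batch jobs delete "$job" --location=us-central1 --quiet
  gcloud workflows delete batch-workflow --location=us-central1 --quiet
  gcloud artifacts repositories delete containers --location=us-central1 --quiet
)

# Example call with the NAME value printed by the job list:
# delete_tutorial_resources projects/PROJECT_ID/locations/us-central1/jobs/job-primegen-TIMESTAMP
```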

What's next