Execute a Cloud Run job using Workflows

Workflows enables you to execute Cloud Run jobs as part of a workflow to perform more complex data processing or orchestrate a system of existing jobs.

This tutorial demonstrates how to use Workflows to execute a Cloud Run job that processes some data in response to an event from Cloud Storage.

Objectives

In this tutorial, you will:

  1. Create a Cloud Run job that processes data from an input file in Cloud Storage.
  2. Deploy a workflow that does the following:
    1. Accepts a Cloud Storage event as an argument.
    2. Checks if the Cloud Storage object specified in the event is the input data file used by the Cloud Run job.
    3. If so, uses the Cloud Run Admin API connector to execute the Cloud Run job.
  3. Create an Eventarc trigger that executes the workflow in response to Cloud Storage events in the bucket containing the input data file.
  4. Trigger the workflow by updating the input data file in Cloud Storage.

Costs

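This tutorial uses billable components of Google Cloud, including the services whose APIs you enable below: Cloud Build, Cloud Run, Cloud Storage, Eventarc, and Workflows.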

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

Before you begin

Some of the steps in this document might not work correctly if your organization applies constraints to your Google Cloud environment. In that case, you might not be able to complete tasks like creating public IP addresses or service account keys. If you make a request that returns an error about constraints, see how to Develop applications in a constrained Google Cloud environment.

Console

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

  4. Enable the Cloud Build, Cloud Run, Cloud Storage, Eventarc, and Workflows APIs.

    Enable the APIs

  5. Create a service account:

    1. In the Google Cloud console, go to the Create service account page.

      Go to Create service account
    2. Select your project.
    3. In the Service account name field, enter a name. The Google Cloud console fills in the Service account ID field based on this name.

      In the Service account description field, enter a description. For example, Service account for quickstart.

    4. Click Create and continue.
    5. To provide access to your project, grant the following roles to your service account: Cloud Run Invoker, Cloud Run Viewer, and Logs Writer.

      In the Select a role list, select a role.

      For additional roles, click Add another role and add each additional role.

    6. Click Continue.
    7. Click Done to finish creating the service account.

gcloud

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Install the Google Cloud CLI.
  3. To initialize the gcloud CLI, run the following command:

    gcloud init
  4. Create or select a Google Cloud project.

    • Create a Cloud project:

      gcloud projects create PROJECT_ID
    • Select the Cloud project that you created:

      gcloud config set project PROJECT_ID
  5. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

  6. Enable the Cloud Build, Cloud Run, Cloud Storage, Eventarc, and Workflows APIs:

    gcloud services enable cloudbuild.googleapis.com run.googleapis.com storage.googleapis.com eventarc.googleapis.com workflows.googleapis.com
  7. Set up authentication:

    1. Create the service account:

      gcloud iam service-accounts create SERVICE_ACCOUNT_NAME

      Replace SERVICE_ACCOUNT_NAME with a name for the service account.

    2. Grant roles to the service account. Run the following command once for each of these IAM roles: roles/run.invoker, roles/run.viewer, and roles/logging.logWriter:

      gcloud projects add-iam-policy-binding PROJECT_ID --member="serviceAccount:SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com" --role=ROLE

      Replace the following:

      • SERVICE_ACCOUNT_NAME: the name of the service account
      • PROJECT_ID: the project ID where you created the service account
      • ROLE: the role to grant
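
Because the role-granting command must be run once per role, a small loop can save some typing. The following is a minimal sketch, not part of the original tutorial: the helper names sa_member and grant_roles are hypothetical, and the arguments are the same SERVICE_ACCOUNT_NAME and PROJECT_ID placeholders used above.

```shell
# Roles that the tutorial asks you to grant to the service account.
ROLES="roles/run.invoker roles/run.viewer roles/logging.logWriter"

# Build the IAM member string from a service account name ($1)
# and a project ID ($2). Hypothetical helper for illustration.
sa_member() {
  printf 'serviceAccount:%s@%s.iam.gserviceaccount.com' "$1" "$2"
}

# Run add-iam-policy-binding once for each role in ROLES.
# $1 = service account name, $2 = project ID.
grant_roles() {
  for role in $ROLES; do
    gcloud projects add-iam-policy-binding "$2" \
      --member="$(sa_member "$1" "$2")" \
      --role="$role"
  done
}

# Usage (replace the placeholders before running):
# grant_roles SERVICE_ACCOUNT_NAME PROJECT_ID
```

The functions only define behavior; nothing is granted until you call grant_roles with your own values.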

Create a Cloud Run job

This tutorial uses a sample Cloud Run job from GitHub. The job reads data from an input file in Cloud Storage and performs some arbitrary processing on each line in the file.

  1. Get the sample code by cloning the sample app repository to your local machine:

    git clone https://github.com/GoogleCloudPlatform/jobs-demos.git
    

    Alternatively, you can download the sample as a ZIP file and extract it.

  2. Change to the directory that contains the sample code:

    cd jobs-demos/parallel-processing
    
  3. Create the Cloud Run job by running the included deployment script:

    ./deploy-parallel-job.sh
    

    The deployment script creates a Cloud Storage bucket with the name input-PROJECT_ID, where PROJECT_ID is the ID of your Google Cloud project, and creates an input data file named input_file.txt in the bucket. The script then creates a Cloud Run job named parallel-job and executes it.

  4. Confirm that the Cloud Run job works as expected by viewing the job executions:

    gcloud beta run jobs executions list --job=parallel-job
    

    You should see a successful execution in the output.

Deploy a workflow that executes the Cloud Run job

Define and deploy a workflow that executes the Cloud Run job you just created. A workflow definition is made up of a series of steps described using the Workflows syntax.

Console

  1. In the Google Cloud console, go to the Workflows page:

    Go to Workflows

  2. Click Create.

  3. Enter a name for the new workflow, such as cloud-run-job-workflow.

  4. Choose an appropriate region; for example, us-central1.

  5. In the Service account field, select the service account you created earlier.

    The service account serves as the workflow's identity and is used to execute the Cloud Run job. You should have already granted the Cloud Run Invoker, Cloud Run Viewer, and Logs Writer roles to the service account.

  6. Click Next.

  7. In the workflow editor, enter the following definition for your workflow:

    main:
        params: [event]
        steps:
            - init:
                assign:
                    - project_id: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                    - event_bucket: ${event.data.bucket}
                    - event_object: ${event.data.name}
                    - target_bucket: ${"input-" + project_id}
                    - target_object: input_file.txt
                    - job_name: parallel-job
                    - job_location: us-central1
            - check_input_object:
                switch:
                    - condition: ${(event_bucket == target_bucket) and (event_object == target_object)}
                      next: run_job
                    - condition: true
                      next: end
            - run_job:
                call: googleapis.run.v1.namespaces.jobs.run
                args:
                    name: ${"namespaces/" + project_id + "/jobs/" + job_name}
                    location: ${job_location}
                result: job_execution
            - finish:
                return: ${job_execution}
  8. Click Deploy.

gcloud

  1. Create a source code file for your workflow:

    touch cloud-run-job-workflow.yaml
    
  2. Copy the following workflow definition to your source code file:

    main:
        params: [event]
        steps:
            - init:
                assign:
                    - project_id: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                    - event_bucket: ${event.data.bucket}
                    - event_object: ${event.data.name}
                    - target_bucket: ${"input-" + project_id}
                    - target_object: input_file.txt
                    - job_name: parallel-job
                    - job_location: us-central1
            - check_input_object:
                switch:
                    - condition: ${(event_bucket == target_bucket) and (event_object == target_object)}
                      next: run_job
                    - condition: true
                      next: end
            - run_job:
                call: googleapis.run.v1.namespaces.jobs.run
                args:
                    name: ${"namespaces/" + project_id + "/jobs/" + job_name}
                    location: ${job_location}
                result: job_execution
            - finish:
                return: ${job_execution}
  3. Deploy the workflow by entering the following command:

    gcloud workflows deploy cloud-run-job-workflow \
        --location=us-central1 \
        --source=cloud-run-job-workflow.yaml \
        --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
    

    Replace the following:

    • SERVICE_ACCOUNT_NAME: The name of the service account you created earlier
    • PROJECT_ID: The ID of your Google Cloud project

    The service account serves as the workflow's identity and is used to execute the Cloud Run job. You should have already granted the Cloud Run Invoker, Cloud Run Viewer, and Logs Writer roles to the service account.

The workflow accepts a Cloud Storage event as an argument and then sets necessary variables in the init step. The subsequent check_input_object step checks if the Cloud Storage object specified in the event is the input data file used by the Cloud Run job. If so, the workflow proceeds to the run_job step, which uses the Cloud Run Admin API connector's googleapis.run.v1.namespaces.jobs.run method to execute the job. The final finish step returns information about the job execution as the result of the workflow.

In the check_input_object step, if the object specified in the event is not the input data file used by the Cloud Run job, the workflow terminates, halting any further processing.
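
To check this branching before any trigger exists, one option is to run the workflow manually with a hand-built event. The sketch below rests on two assumptions: the payload contains only the data.bucket and data.name fields that the definition reads (real Cloud Storage events carry more fields), and the workflow was deployed as cloud-run-job-workflow in us-central1. The helper names are hypothetical.

```shell
# Local mirror of the check_input_object condition. Prints the step the
# workflow is expected to jump to.
# $1 = project ID, $2 = event bucket, $3 = event object name.
expected_next_step() {
  if [ "$2" = "input-$1" ] && [ "$3" = "input_file.txt" ]; then
    echo run_job
  else
    echo end
  fi
}

# Build a minimal event payload with only the fields the workflow reads.
# $1 = bucket name, $2 = object name.
make_event() {
  printf '{"data":{"bucket":"%s","name":"%s"}}' "$1" "$2"
}

# Execute the workflow with a synthetic event and wait for its result.
run_with_event() {
  gcloud workflows run cloud-run-job-workflow \
    --location=us-central1 \
    --data="$(make_event "$1" "$2")"
}

# Usage (replace PROJECT_ID before running):
# run_with_event input-PROJECT_ID input_file.txt   # should reach run_job
# run_with_event input-PROJECT_ID other_file.txt   # should skip the job
```

Note that the first call executes the Cloud Run job for real, so it incurs the same cost as a triggered run.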

Create an Eventarc trigger for the workflow

To automatically execute the workflow and in turn the Cloud Run job whenever the input data file is updated, create an Eventarc trigger that responds to Cloud Storage events in the bucket containing the input data file.

Console

  1. In the Google Cloud console, go to the Workflows page:

    Go to Workflows

  2. Click the name of your workflow, such as cloud-run-job-workflow.

  3. On the Workflow details page, click Edit.

  4. On the Edit Workflow page, in the Triggers section, click Add new trigger > Eventarc.

    The Eventarc trigger pane opens.

  5. In the Trigger name field, enter a name for the trigger, such as cloud-run-job-workflow-trigger.

  6. From the Event provider list, select Cloud Storage.

  7. From the Event list, select google.cloud.storage.object.v1.finalized.

  8. In the Bucket field, select the bucket containing the input data file. The bucket name has the form input-PROJECT_ID.

  9. Click Save trigger.

    The Eventarc trigger now appears in the Triggers section on the Edit Workflow page.

  10. Click Next.

  11. Click Deploy.

gcloud

Create an Eventarc trigger by running the following command:

gcloud eventarc triggers create cloud-run-job-workflow-trigger \
    --location=us \
    --destination-workflow=cloud-run-job-workflow \
    --destination-workflow-location=us-central1 \
    --event-filters="type=google.cloud.storage.object.v1.finalized" \
    --event-filters="bucket=input-PROJECT_ID" \
    --service-account="PROJECT_NUMBER-compute@developer.gserviceaccount.com"

Replace the following:

  • PROJECT_ID: The ID of your Google Cloud project
  • PROJECT_NUMBER: Your Google Cloud project number, which forms part of the Compute Engine default service account's email address

Whenever a file is uploaded or overwritten in the Cloud Storage bucket containing the input data file, the workflow is executed with the corresponding Cloud Storage event as an argument.
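
If you do not know your project number, it can be looked up from the project ID. The following sketch shows one way to build the service account email used in the trigger command; the helper names project_number and default_compute_sa are hypothetical.

```shell
# Look up the project number for a project ID ($1).
project_number() {
  gcloud projects describe "$1" --format='value(projectNumber)'
}

# Build the Compute Engine default service account email
# from a project number ($1).
default_compute_sa() {
  printf '%s-compute@developer.gserviceaccount.com' "$1"
}

# Usage (replace PROJECT_ID before running):
# default_compute_sa "$(project_number PROJECT_ID)"
```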

Trigger the workflow

Test the end-to-end system by updating the input data file in Cloud Storage.

  1. Generate new data for the input file and upload it to Cloud Storage in the location expected by the Cloud Run job:

    base64 /dev/urandom | head -c 100000 >input_file.txt
    gsutil cp input_file.txt gs://input-PROJECT_ID/input_file.txt
    

    Replace PROJECT_ID with the ID of your Google Cloud project.

  2. Confirm that the Cloud Run job ran as expected by viewing the job executions:

    gcloud beta run jobs executions list --job=parallel-job
    

    You should see a new job execution in the output.

  3. When files other than the input data file are updated in the Cloud Storage bucket, the Cloud Run job should not be executed. Confirm the workflow's behavior by viewing its executions in the Google Cloud console:

    1. Go to the Workflows page:

      Go to Workflows

    2. Click the name of your workflow, such as cloud-run-job-workflow, to go to its details page.

    3. To view the workflow executions, click the Executions tab.

    Completed executions that invoked the Cloud Run job appear with a finish state, while executions that did not invoke the job appear with an end state.
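
The negative case in the last step can also be exercised from the command line. The sketch below uploads a file that is not the input data file and then lists recent workflow executions. The file name decoy.txt and the helper names are hypothetical, and the workflow name and location are assumed to match the ones used earlier.

```shell
# Build the gs:// URI for an object in the tutorial's input bucket.
# $1 = project ID, $2 = object name.
input_uri() {
  printf 'gs://input-%s/%s' "$1" "$2"
}

# Upload a file other than input_file.txt, then list recent executions
# of the workflow. $1 = project ID.
upload_decoy_and_list() {
  echo "not the input file" > decoy.txt
  gsutil cp decoy.txt "$(input_uri "$1" decoy.txt)"
  gcloud workflows executions list cloud-run-job-workflow \
    --location=us-central1 \
    --limit=5
}

# Usage (replace PROJECT_ID before running):
# upload_decoy_and_list PROJECT_ID
```

The upload should produce a new workflow execution, but no new entry in the Cloud Run job's execution list.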

Clean up

If you created a new project for this tutorial, delete the project. If you used an existing project and wish to keep it without the changes added in this tutorial, delete resources created for the tutorial.

Delete the project

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete tutorial resources

Delete the resources you created in this tutorial:

  1. Delete the Eventarc trigger:

    gcloud eventarc triggers delete cloud-run-job-workflow-trigger --location=us
    
  2. Delete the workflow:

    gcloud workflows delete cloud-run-job-workflow --location=us-central1
    
  3. Delete the Cloud Run job:

    gcloud beta run jobs delete parallel-job
    
  4. Delete the Cloud Storage bucket created for the input data:

    gsutil rm -r gs://input-PROJECT_ID
    
