Authenticate to Dataflow

This document describes how to authenticate to Dataflow programmatically. How you authenticate to Dataflow depends on the interface you use to access the API and the environment where your code is running.

For more information about Google Cloud authentication, see the authentication overview.

API access

Dataflow supports programmatic access. You can access the API in the following ways:

Client libraries

The Dataflow client libraries provide high-level language support for authenticating to Dataflow programmatically. To authenticate calls to Google Cloud APIs, client libraries support Application Default Credentials (ADC); the libraries look for credentials in a set of defined locations and use those credentials to authenticate requests to the API. With ADC, you can make credentials available to your application in a variety of environments, such as local development or production, without needing to modify your application code.
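As a rough illustration of the lookup described above, the following Python sketch mirrors the ADC search order: the GOOGLE_APPLICATION_CREDENTIALS environment variable first, then the local credential file written by the gcloud CLI, then the service account attached to the compute resource. The function names are hypothetical, and the sketch assumes the default gcloud configuration directory. It only reports which location would be consulted; the client libraries perform the real lookup for you.

```python
import os
from pathlib import Path


def well_known_adc_file(env, home):
    """Return the default path of the local ADC credential file."""
    if os.name == "nt":
        # On Windows the file is stored under %APPDATA%\gcloud.
        base = Path(env.get("APPDATA", str(home))) / "gcloud"
    else:
        base = Path(home) / ".config" / "gcloud"
    return base / "application_default_credentials.json"


def describe_adc_source(env=None, home=None):
    """Report which ADC location would be used first (illustration only)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. An explicitly configured credential file wins.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return f"environment variable -> {explicit}"

    # 2. Otherwise, the file created by "gcloud auth application-default login".
    local_file = well_known_adc_file(env, home)
    if local_file.exists():
        return f"local ADC file -> {local_file}"

    # 3. Otherwise, the service account attached to the compute resource.
    return "attached service account (metadata server)"
```

Calling describe_adc_source() with no arguments inspects the current environment, which can help explain why the same code picks up different credentials on a workstation and on a VM.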

Google Cloud CLI

When you use the gcloud CLI to access Dataflow, you log in to the gcloud CLI with a user account, which provides the credentials used by the gcloud CLI commands.

If your organization's security policies prevent user accounts from having the required permissions, you can use service account impersonation.

For more information, see Authenticate for using the gcloud CLI. For more information about using the gcloud CLI with Dataflow, see the gcloud CLI reference pages.

REST

You can authenticate to the Dataflow API by using your gcloud CLI credentials or by using Application Default Credentials. For more information about authentication for REST requests, see Authenticate for using REST. For information about the types of credentials, see gcloud CLI credentials and ADC credentials.

Set up authentication for Dataflow

How you set up authentication depends on the environment where your code is running.

The following options for setting up authentication are the most commonly used. For more options and information about authentication, see Authentication methods.

For a local development environment

You can set up credentials for a local development environment in the following ways:

Client libraries or third-party tools

Set up Application Default Credentials (ADC) in your local environment:

  1. Install the Google Cloud CLI, then initialize it by running the following command:

    gcloud init
  2. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

    A sign-in screen appears. After you sign in, your credentials are stored in the local credential file used by ADC.

For more information about working with ADC in a local environment, see Local development environment.
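One way to confirm that the local ADC setup worked is to ask the gcloud CLI for an access token derived from those credentials. The sketch below wraps the gcloud auth application-default print-access-token command in Python; the helper name adc_login_works is hypothetical, and the sketch assumes that gcloud is on your PATH.

```python
import subprocess

# Prints an access token minted from the local ADC credentials.
ADC_TOKEN_COMMAND = ["gcloud", "auth", "application-default", "print-access-token"]


def adc_login_works(run=subprocess.run):
    """Return True if the gcloud CLI can mint a token from local ADC.

    The `run` parameter exists only so the check can be exercised
    without a real gcloud installation.
    """
    try:
        result = run(ADC_TOKEN_COMMAND, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # The gcloud CLI is not installed or not on PATH.
        return False
    # A zero exit code with non-empty output means a token was produced.
    return result.returncode == 0 and bool(result.stdout.strip())
```

The function deliberately discards the token itself, so the check does not leak a credential into logs.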

REST requests from the command line

When you make a REST request from the command line, you can use your gcloud CLI credentials by including gcloud auth print-access-token as part of the command that sends the request.

The following example lists service accounts for the specified project. You can use the same pattern for any REST request.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your Google Cloud project ID.
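The request itself is not reproduced in this text, so the following Python sketch shows one way the pattern might look. It assumes the IAM API's service account listing endpoint (https://iam.googleapis.com/v1/projects/PROJECT_ID/serviceAccounts); the function names are hypothetical. The access token comes from the gcloud auth print-access-token command and is sent as a bearer token.

```python
import subprocess
import urllib.request


def gcloud_access_token():
    """Return an access token for the account logged in to the gcloud CLI."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def build_list_service_accounts_request(project_id, token):
    """Build (but do not send) a GET request that lists service accounts."""
    # Assumed endpoint: the IAM API method that lists service accounts.
    url = f"https://iam.googleapis.com/v1/projects/{project_id}/serviceAccounts"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def list_service_accounts(project_id):
    """Send the request and return the raw JSON response body as text."""
    request = build_list_service_accounts_request(project_id, gcloud_access_token())
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

To adapt the pattern to another REST method, change the URL and HTTP method; the Authorization header stays the same.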


For more information about authenticating using REST and gRPC, see Authenticate for using REST. For information about the difference between your local ADC credentials and your gcloud CLI credentials, see gcloud CLI authentication configuration and ADC configuration.

On Google Cloud

To authenticate a workload running on Google Cloud, you use the credentials of the service account attached to the compute resource where your code is running, such as a Compute Engine virtual machine (VM) instance. This approach is the preferred authentication method for code running on a Google Cloud compute resource.

For most services, you must attach the service account when you create the resource that will run your code; you cannot add or replace the service account later. Compute Engine is an exception—it lets you attach a service account to a VM instance at any time.
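Client libraries obtain the attached service account's credentials for you through ADC. To illustrate what happens underneath, the sketch below requests a token directly from the metadata server that Google Cloud compute resources expose. The function names are hypothetical, and the fetch function only succeeds when it runs on such a resource.

```python
import json
import urllib.request

# Token endpoint for the default service account attached to the resource.
METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_metadata_token_request():
    """Build the token request; the metadata server requires this header."""
    return urllib.request.Request(
        METADATA_TOKEN_URL, headers={"Metadata-Flavor": "Google"}
    )


def parse_token_response(body):
    """Extract the access token and its lifetime in seconds from the JSON reply."""
    payload = json.loads(body)
    return payload["access_token"], payload["expires_in"]


def fetch_attached_service_account_token(timeout=5):
    """Fetch a token for the attached service account (Google Cloud only)."""
    request = build_metadata_token_request()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_token_response(response.read().decode("utf-8"))
```

In application code, prefer the client libraries over calling the metadata server yourself, because they also handle token refresh.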

Use the gcloud CLI to create a service account and attach it to your resource:

  1. Install the Google Cloud CLI, then initialize it by running the following command:

    gcloud init
  2. Set up authentication:

    1. Create the service account:

      gcloud iam service-accounts create SERVICE_ACCOUNT_NAME

      Replace SERVICE_ACCOUNT_NAME with a name for the service account.

    2. To provide access to your project and your resources, grant a role to the service account:

      gcloud projects add-iam-policy-binding PROJECT_ID --member="serviceAccount:SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com" --role=ROLE

      Replace the following:

      • SERVICE_ACCOUNT_NAME: the name of the service account
      • PROJECT_ID: the project ID where you created the service account
      • ROLE: the role to grant
    3. To grant another role to the service account, rerun the command from the previous step with a different ROLE value.
    4. Grant the Service Account User role (roles/iam.serviceAccountUser) to the principal that will attach the service account to other resources:

      gcloud iam service-accounts add-iam-policy-binding SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com --member="user:USER_EMAIL" --role=roles/iam.serviceAccountUser

      Replace the following:

      • SERVICE_ACCOUNT_NAME: the name of the service account
      • PROJECT_ID: the project ID where you created the service account
      • USER_EMAIL: the email address for a Google Account
  3. Create the resource that will run your code, and attach the service account to that resource. For example, if you use Compute Engine:

    Create a Compute Engine instance with the service account attached:

    gcloud compute instances create INSTANCE_NAME --zone=ZONE --service-account=SERVICE_ACCOUNT_EMAIL

    Replace the following:

    • INSTANCE_NAME: your preferred instance name
    • ZONE: the zone in which you want to create your instance
    • SERVICE_ACCOUNT_EMAIL: the email address for the service account that you created

For more information about authenticating to Google APIs, see Authentication methods.
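The steps above can be sketched as a small Python helper that assembles the same gcloud commands from one set of inputs, which keeps the service account email consistent across steps. The function names are hypothetical, and the helper only builds the commands; it does not run them.

```python
import shlex


def service_account_email(name, project_id):
    """Derive the service account email from its name and project ID."""
    return f"{name}@{project_id}.iam.gserviceaccount.com"


def setup_commands(name, project_id, role, user_email, instance_name, zone):
    """Return the gcloud commands from the steps above as argument lists."""
    email = service_account_email(name, project_id)
    return [
        # Create the service account.
        ["gcloud", "iam", "service-accounts", "create", name],
        # Grant a project-level role to the service account.
        ["gcloud", "projects", "add-iam-policy-binding", project_id,
         f"--member=serviceAccount:{email}", f"--role={role}"],
        # Let the user attach the service account to resources.
        ["gcloud", "iam", "service-accounts", "add-iam-policy-binding", email,
         f"--member=user:{user_email}", "--role=roles/iam.serviceAccountUser"],
        # Create a Compute Engine instance that runs as the service account.
        ["gcloud", "compute", "instances", "create", instance_name,
         f"--zone={zone}", f"--service-account={email}"],
    ]


def as_shell(commands):
    """Render the commands as shell lines that are safe to copy and paste."""
    return "\n".join(shlex.join(command) for command in commands)
```

Printing as_shell(setup_commands(...)) with your own values gives a reviewable script; passing each argument list to subprocess.run would execute the steps instead.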

On-premises or on a different cloud provider

The preferred method to set up authentication from outside of Google Cloud is to use workload identity federation. For more information, see On-premises or another cloud provider in the authentication documentation.

Access control for Dataflow

After you authenticate to Dataflow, you must be authorized to access Google Cloud resources. Dataflow uses Identity and Access Management (IAM) for authorization.

For more information about the roles for Dataflow, see Access control with IAM. For more information about IAM and authorization, see IAM overview.

What's next