Stay organized with collections Save and categorize content based on your preferences.

Create a BigQuery Azure connection

BigQuery Omni accesses Azure Blob Storage data through connections. BigQuery Omni provides two methods for securely accessing data from Azure Blob Storage. You can either use identity federation by granting a Google Cloud service account access to your Azure application, or you can directly grant access to your Azure Active Directory (AD) application in your tenant:

  • Use an Azure federated identity. BigQuery Omni support for Azure workload identity federation is in preview. With BigQuery Omni supporting Azure workload identity federation, you can grant a Google service account access to the Azure application that's within your tenant. Federated identity access is more secure than a non-federated access because the application client exists within your Azure tenant. The application client's secrets are not managed by you or Google.

    With identity federation, you have more control over the application because your application to which you have granted access to your data exists within your tenant.

  • Use a non-federated identity. Each connection has its own unique Azure Active Directory (Azure AD) application. Permissions are granted to the applications through Azure Identity and Access Management (IAM) roles. The granted Azure IAM roles determine what data BigQuery can access for each connection.

After you create a BigQuery Azure connection, you can either query the data in the Azure Blob storage or export data from Azure to BigQuery.

Before you begin

Quotas

Your project can create up to 50 Azure connections for each project. The quota includes both federated and non-federated Azure connections. For more information about quotas, see BigQuery Connection API.

Use an Azure federated identity

BigQuery Omni support for Azure workload identity federation is in preview. To access data using a federated identity, follow these steps:

  1. Create an application in your Azure tenant.
  2. Create the BigQuery Azure connection.
  3. Add a federated credential.
  4. Assign a role to BigQuery's Azure AD applications.

For more information about using federated identity credentials to access data in Azure, see Workload identity federation.

Create an application in your Azure tenant

To create an application in your Azure tenant, follow these steps:

  1. In the Azure portal, go to App registrations, and then click New registration.

  2. For Names, enter a name for your application.

  3. For Supported account types, select Accounts in this organizational directory only.

  4. To register the new application, click Register.

  5. Make a note of the Application (client) ID. You need to provide this ID when you create a BigQuery Azure connection.

    Azure portal for creating applications

For more information, see how to register an application in Azure.

Create the BigQuery Azure connection

To create a BigQuery Azure connection, use the bq mk command.

    bq mk --connection --connection_type='Azure' \
      --tenant_id=TENANT_ID \
      --location=AZURE_LOCATION \
      --federated_azure=true \
      --federated_app_client_id=APP_ID \
      --project_id=PROJECT_ID \
      CONNECTION_NAME
    

Replace the following:

  • TENANT_ID: the tenant ID of the Azure directory that contains the Azure Storage account.
  • AZURE_LOCATION: the Azure region where your Azure Storage data is located. BigQuery Omni supports the azure-eastus2 region.
  • APP_ID: the Azure Application (client) ID. To learn how to get this ID, see Create application in Azure tenant.
  • CONNECTION_NAME: the name of the connection.
  • PROJECT_ID: Optional: the ID of your Cloud project.

The output is similar to the following:

Connection CONNECTION_NAME successfully created
Please add the following identity to your Azure application APP_ID
Identity: SUBJECT_ID

This output includes the following values:

  • APP_ID: the ID of the application that you created.
  • SUBJECT_ID: the ID of the Google Cloud service account that the user authorizes to access their application. This value is required when you create a federated credential in Azure.

Note the APP_ID and the SUBJECT_ID values for use in the next steps.

Next, add a federated credential for your application.

Add a federated credential

To create a federated credential, follow these steps:

  1. In the Azure portal, go to App registrations, and then click your application.

  2. Select Certificates & secrets > Federated credentials > Add credentials. Then, do the following:

    1. From the Federated credential scenario list, select Other issuer.

    2. For Issuer, enter https://accounts.google.com.

    3. For Subject identifier, enter the ID of the Google Cloud service account that you got when you created the BigQuery Azure connection.

    4. For Name, enter a name for the credential.

    5. Click Add.

For more information, see Configure an app to trust an external identity provider (preview).

Assign a role to BigQuery's Azure applications

To assign a role to BigQuery's Azure application, use the Azure Portal, the Azure PowerShell, or the Microsoft Management REST API:

Azure Portal

You can perform role assignments in the Azure Portal by logging in as a user with the Microsoft.Authorization/roleAssignments/write permission. The role assignment lets the BigQuery Azure connection access the Azure Storage data as specified in the roles policy.

To add role assignments using the Azure Portal, follow these steps:

  1. From your Azure Storage account, enter IAM in the search bar.

  2. Click Access Control (IAM).

  3. Click Add and select Add role assignments.

  4. To provide read-only access, select the Storage Blob Data Reader role. To provide read-write access, select the Storage Blob Data Contributor role.

  5. Set Assign access to to User, group, or service principal.

  6. Click Select members.

  7. In the Select field, enter the Azure application name that you gave when you created the application in the Azure tenant.

  8. Click Save.

For more information, see Assign Azure roles using the Azure portal.

Azure PowerShell

To add a role assignment for a service principal at a resource scope, you can use the New-AzRoleAssignment command:

  New-AzRoleAssignment`
   -SignInName APP_NAME`
   -RoleDefinitionName ROLE_NAME`
   -ResourceName RESOURCE_NAME`
   -ResourceType RESOURCE_TYPE`
   -ParentResource PARENT_RESOURCE`
   -ResourceGroupName RESOURCE_GROUP_NAME

Replace the following:

  • APP_NAME: the application name.
  • ROLE_NAME: the role name you want to assign.
  • RESOURCE_NAME: the resource name.
  • RESOURCE_TYPE: the resource type.
  • PARENT_RESOURCE: the parent resource.
  • RESOURCE_GROUP_NAME: the resource group name.

For more information about using Azure PowerShell to add a new service principal, see the Assign Azure roles using Azure PowerShell.

Azure CLI

To add a role assignment for a service principal at a resource scope, you can use the Azure command-line tool. You must have the Microsoft.Authorization/roleAssignments/write permission for the storage account to grant roles.

To assign a role, such as the Storage Blob Data Contributor role, to the service principal, run the az role assignment create command:

  az role assignment create --role "Storage Blob Data Contributor" \
    --assignee-object-id ${SP_ID} \
    --assignee-principal-type ServicePrincipal \
    --scope   subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME

Replace the following:

  • SP_ID: the service principal ID. This service principal is for the application that you created. To get the service principal for a federated connection, see Service principal object.
  • STORAGE_ACCOUNT_NAME: the storage account name.
  • RESOURCE_GROUP_NAME: the resource group name.
  • SUBSCRIPTION_ID: the subscription ID.

For more information, see Assign Azure roles using Azure CLI.

Microsoft REST API

To add role assignments for a service principal, you can send an HTTP request to Microsoft Management.

To call the Microsoft Graph REST API, retrieve an OAuth token for an application. For more information, see Get access without a user. The application that called the Microsoft Graph REST API must have the Application.ReadWrite.All application permission.

To generate an OAuth token, run the following command:

  export TOKEN=$(curl -X POST \
    https://login.microsoftonline.com/TENANT_ID/oauth2/token \
    -H 'cache-control: no-cache' \
    -H 'content-type: application/x-www-form-urlencoded' \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "resource=https://graph.microsoft.com/" \
    --data-urlencode "client_id=CLIENT_ID" \
    --data-urlencode "client_secret=CLIENT_SECRET" \
  | jq --raw-output '.access_token')

Replace the following:

  • TENANT_ID: the tenant ID matching the ID of the Azure directory that contains the Azure Storage account.
  • CLIENT_ID: the Azure client ID.
  • CLIENT_SECRET: the Azure client secret.

Get the ID of the Azure built-in roles that you want to assign to the service principal.

These are some common roles:

To assign a role to the service principal, call the Microsoft Graph REST API to the Azure Resource Management REST API:

  export ROLE_ASSIGNMENT_ID=$(uuidgen)
  curl -X PUT \
'https://management.azure.com/subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME/providers/Microsoft.Authorization/roleAssignments/ROLE_ASSIGNMENT_ID?api-version=2018-01-01-preview' \
    -H "authorization: Bearer ${TOKEN?}" \
    -H 'cache-control: no-cache' \
    -H 'content-type: application/json' \
    -d '{
        "properties": {
            "roleDefinitionId": "subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME/providers/Microsoft.Authorization/roleDefinitions/ROLE_ID",
            "principalId": "SP_ID"
        }
    }'

Replace the following:

  • ROLE_ASSIGNMENT_ID: the role ID.
  • SP_ID: the service principal ID. This service principal is for the application that you created. To get the service principal for a federated connection, see Service principal object.
  • SUBSCRIPTION_ID: the subscription ID.
  • RESOURCE_GROUP_NAME: the resource group name.
  • STORAGE_ACCOUNT_NAME: the storage account name.
  • SUBSCRIPTION_ID: the subscription ID.

The connection is now ready to use. However, there might be a propagation delay for a role assignment in Azure. If you are not able to use the connection due to permission issues, then retry after some time.

Use a non-federated identity

To access data using a non-federated identity, follow these steps:

  1. Create the BigQuery Azure connection.
  2. Create an Azure AD service principal.
  3. Assign a role to BigQuery's Azure AD applications.

For more information about using non-federated identity to access data in Azure, see Azure AD application.

Create the BigQuery Azure connection

To create a BigQuery Azure connection, use the Google Cloud console or the bq command-line tool:

Console

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the Add data menu, select External data source.

  3. In the External data source pane, enter the following information:

    • For Connection type, select BigLake on Azure (via BigQuery Omni).
    • For Connection ID, enter an identifier for the connection resource. You can use letters, numbers, dashes, and underscores.
    • For Connection location, select azure-eastus2.
    • Optional: For Friendly name, enter a user-friendly name for the connection, such as My connection resource. The friendly name can be any value that helps you identify the connection resource if you need to modify it later.
    • Optional: For Description, enter a description for the connection resource.
    • For Azure tenant id, enter the Azure tenant ID, which is also referred to as the Directory (tenant) ID.

  4. Click Create connection.

  5. In the Explorer panel, click the dataset in which you created the connection ID, and then click the connection ID to view its details.

  6. In the Connection info section, note the values of Azure app ID and Azure app name. These values are required when you assign a role to BigQuery's Azure application.

bq

Use the bq mk command. To get the output in JSON format, use the --format=json parameter.

bq mk --connection --connection_type='Azure' \
  --tenant_id=TENANT_ID \
  --location=AZURE_LOCATION \
  CONNECTION_NAME

Replace the following:

  • TENANT_ID: the tenant ID of the Azure directory that contains the Azure Storage account.
  • AZURE_LOCATION: the Azure region where your Azure Storage data is located. BigQuery Omni supports the azure-eastus2 region.
  • CONNECTION_NAME: the name of the connection.

The output is similar to the following:

  Please create a Service Principal in your directory for
  appId: APP_ID,
  and perform role assignment to
  app: APP_NAME
  to allow BigQuery to access your Azure data.

This output includes the following values:

  • APP_ID: the ID of the application that you created.
  • APP_NAME: the app name that you need to assign roles that let BigQuery access your Azure data.

Note the APP_ID and the APP_NAME values for use in the next steps.

For more information, see Create the BigQuery Azure connection.

Create an Azure AD service principal

To create an Azure AD service principal, use the Google Cloud console, Azure PowerShell, the Azure command-line tool, or the Microsoft Graph REST API:

Console

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer panel, click the BigQuery Azure connection that you created.

  3. On the Connection info pane, click Login to Azure account.

  4. Log in to your Azure account.

  5. On the Permissions Requested page, click Accept.

Azure PowerShell

To create the service principal for the app ID APP_ID that was returned previously, the user must have the Microsoft.directory/servicePrincipals/create permission.

To create the service principal, run the New-AzADServicePrincipal command:

New-AzADServicePrincipal`
-ApplicationId APP_ID`
-SkipAssignment

Replace APP_ID with the app ID that was returned previously.

Azure CLI

To create the service principal for the app ID APP_ID that was returned previously, you can use the Azure command-line tool. You must have the Microsoft.directory/servicePrincipals/create permission to create a service principal.

To create the service principal, run the az ad sp command:

export SP_ID=$(az ad sp create --id APP_ID | jq --raw-output '.objectId')

Replace APP_ID with the app ID that was returned previously.

Microsoft REST API

To create the service principal for the app ID APP_ID that was returned previously, you can send an HTTP request to the Microsoft Graph REST API.

To call the Microsoft Graph REST API, retrieve an OAuth token for an application. For more information, see Get access without a user. The application used to call the Microsoft Graph REST API should have the Application.ReadWrite.All application permission.

The TENANT_ID should match the ID of the Azure directory that contains the Azure Storage account.

To generate an OAuth token, run the following command:

export TOKEN=$(curl -X POST \
https://login.microsoftonline.com/TENANT_ID/oauth2/token \
-H 'cache-control: no-cache' \
-H 'content-type: application/x-www-form-urlencoded' \
--data-urlencode "grant_type=client_credentials" \
--data-urlencode "resource=https://graph.microsoft.com/" \
--data-urlencode "client_id=CLIENT_ID" \
--data-urlencode "client_secret=CLIENT_SECRET" \
| jq --raw-output '.access_token')

Replace the following:

  • TENANT_ID: the tenant ID of the Azure directory that contains the Azure Storage account.
  • CLIENT_ID: the Azure client ID.
  • CLIENT_SECRET: the Azure client secret.

Run the following command:

export APP_ID=APP_ID

Replace APP_ID with the app ID that was returned.

To create a service principal by calling the Microsoft Graph REST API, run the following command:

export SP_ID=$(curl -X POST \
https://graph.microsoft.com/v1.0/serviceprincipals \
-H "authorization: Bearer ${TOKEN?}" \
-H 'cache-control: no-cache' \
-H 'content-type: application/json' \
-d "{
  \"appId\": \"${APP_ID?}\"
}" | jq --raw-output '.id')

Replace the following:

  • TOKEN: the OAuth token for the application.
  • APP_ID: the app ID that was returned previously.

Assign a role to BigQuery's Azure AD applications

Azure Portal

You can perform role assignments in the Azure Portal by logging in as a user with the Microsoft.Authorization/roleAssignments/write permission. The role assignment lets the BigQuery Azure connection access the Azure Storage data as specified in the roles policy.

To add role assignments using the Azure Portal, follow these steps:

  1. From your Azure Storage account, enter IAM in the search bar.

  2. Click Access Control (IAM).

  3. Click Add and select Add role assignments.

  4. To provide read-only access, select the Storage Blob Data Reader role. To provide read-write access, select the Storage Blob Data Contributor role.

  5. Set Assign access to to User, group, or service principal.

  6. Click Select members.

  7. In the Select field, enter the Azure app name that was returned when you created the BigQuery Azure connection.

  8. Click Save.

For more information, see Assign Azure roles using the Azure portal.

Azure PowerShell

To add a role assignment for a service principal at a resource scope, you can use the New-AzRoleAssignment command:

  New-AzRoleAssignment`
   -SignInName APP_NAME`
   -RoleDefinitionName ROLE_NAME`
   -ResourceName RESOURCE_NAME`
   -ResourceType RESOURCE_TYPE`
   -ParentResource PARENT_RESOURCE`
   -ResourceGroupName RESOURCE_GROUP_NAME

Replace the following:

  • APP_NAME: the application name.
  • ROLE_NAME: the role name you want to assign.
  • RESOURCE_NAME: the resource name.
  • RESOURCE_TYPE: the resource type.
  • PARENT_RESOURCE: the parent resource.
  • RESOURCE_GROUP_NAME: the resource group name.

For more information about using Azure PowerShell to add a new service principal, see the Assign Azure roles using Azure PowerShell.

Azure CLI

To add a role assignment for a service principal at a resource scope, you can use the Azure command-line tool. You must have the Microsoft.Authorization/roleAssignments/write permission for the storage account to grant roles.

To assign a role, such as the Storage Blob Data Contributor role, to the service principal, run the az role assignment create command:

  az role assignment create --role "Storage Blob Data Contributor" \
    --assignee-object-id ${SP_ID} \
    --assignee-principal-type ServicePrincipal \
    --scope   subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME

Replace the following:

  • SP_ID: the service principal ID.
  • STORAGE_ACCOUNT_NAME: the storage account name.
  • RESOURCE_GROUP_NAME: the resource group name.
  • SUBSCRIPTION_ID: the subscription ID.

For more information, see Assign Azure roles using Azure CLI.

Microsoft REST API

To add role assignments for a service principal, you can send an HTTP request to Microsoft Management.

To call the Microsoft Graph REST API, retrieve an OAuth token for an application. For more information, see Get access without a user. The application that called the Microsoft Graph REST API must have the Application.ReadWrite.All application permission.

To generate an OAuth token, run the following command:

  export TOKEN=$(curl -X POST \
    https://login.microsoftonline.com/TENANT_ID/oauth2/token \
    -H 'cache-control: no-cache' \
    -H 'content-type: application/x-www-form-urlencoded' \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "resource=https://graph.microsoft.com/" \
    --data-urlencode "client_id=CLIENT_ID" \
    --data-urlencode "client_secret=CLIENT_SECRET" \
  | jq --raw-output '.access_token')

Replace the following:

  • TENANT_ID: the tenant ID matching the ID of the Azure directory that contains the Azure Storage account.
  • CLIENT_ID: the Azure client ID.
  • CLIENT_SECRET: the Azure client secret.

Get the ID of the Azure built-in roles that you want to assign to the service principal.

These are some common roles:

To assign a role to the service principal, call the Microsoft Graph REST API to the Azure Resource Management REST API:

  export ROLE_ASSIGNMENT_ID=$(uuidgen)
  curl -X PUT \
'https://management.azure.com/subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME/providers/Microsoft.Authorization/roleAssignments/ROLE_ASSIGNMENT_ID?api-version=2018-01-01-preview' \
    -H "authorization: Bearer ${TOKEN?}" \
    -H 'cache-control: no-cache' \
    -H 'content-type: application/json' \
    -d '{
        "properties": {
            "roleDefinitionId": "subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME/providers/Microsoft.Authorization/roleDefinitions/ROLE_ID",
            "principalId": "SP_ID"
        }
    }'

Replace the following:

  • ROLE_ASSIGNMENT_ID: the role ID.
  • SP_ID: the service principal ID.
  • SUBSCRIPTION_ID: the subscription ID.
  • RESOURCE_GROUP_NAME: the resource group name.
  • STORAGE_ACCOUNT_NAME: the storage account name.
  • SUBSCRIPTION_ID: the subscription ID.

The connection is now ready to use. However, there might be a propagation delay for a role assignment in Azure. If you are not able to use the connection due to permission issues, then retry after some time.

What's next