Create a BigQuery Azure connection

BigQuery Omni accesses Azure Storage data through connections. Each connection has its own unique Azure Active Directory (Azure AD) application. Permissions are granted to the applications through Azure identity and access management (IAM) roles. The granted Azure IAM roles determine what data BigQuery can access for each connection.

Before you begin

Ensure that you've created the following resources:

Creating the BigQuery Azure connection

To create a BigQuery Azure connection, use the Google Cloud console or the bq command-line tool:

Google Cloud console

  1. To create a connection resource, in the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the Add data menu, select External data source.

  3. In the External data source pane, enter the following information:

    • For Connection type, select Azure.
    • For Connection ID, enter an identifier for the connection resource. You can use letters, numbers, dashes, and underscores.
    • For Connection location, select azure-eastus2.
    • Optional: For Friendly name, enter a user-friendly name for the connection, such as My connection resource. The friendly name can be any value that helps you identify the connection resource if you need to modify it later.
    • Optional: For Description, enter a description for the connection resource.
    • For Azure tenant id, enter the Azure tenant ID, also called the directory ID.
  4. Click Create connection.

    Take note of the value in the Azure app ID field and the value in the Azure app name field for later use.

bq

bq mk --connection --connection_type='Azure' \
--tenant_id=TENANT_ID \
--location=AZURE_LOCATION \
CONNECTION_NAME

Replace the following:

  • TENANT_ID: the tenant ID of the Azure directory that contains the Azure Storage account.
  • AZURE_LOCATION: must be the Azure region where your Azure Storage data is located. Only the azure-eastus2 region is supported.
  • CONNECTION_NAME: the name that you give this connection.

The command line shows the following output:

  Please create a Service Principal in your directory for
  appId: APP_ID,
  and perform role assignment to
  app: APP_NAME
  to allow BigQuery to access your Azure data.

The output contains the following:

  • APP_ID: the app ID that you need in order to create a service principal in your directory.
  • APP_NAME: the app name that you need in order to assign roles that allow BigQuery to access your Azure data.

Take note of the APP_ID and the APP_NAME values for use in the next steps.

To get the output in JSON format, use the --format=json parameter:

  {"clientId": APP_ID, "application": APP_NAME}

Creating an Azure AD service principal

To create an Azure AD service principal, use the Google Cloud console, Azure PowerShell, the Azure command-line tool, or the Microsoft Graph REST API:

Google Cloud console

To create an Azure AD service principal directly, in the Google Cloud console, in the Connection info pane, click Create service principal.

Log in to your Azure account. On the Permissions Requested page that appears, click Accept.

Azure PowerShell

To create the service principle for the app ID APP_ID that was returned previously, you can use Azure PowerShell. The user must have Microsoft.directory/servicePrincipals/create permissions.

Run the following command:

New-AzADServicePrincipal`
-ApplicationId APP_ID`
-SkipAssignment

Replace APP_ID with the app ID that was returned previously.

Azure CLI

To create the service principal for the app ID APP_ID that was returned previously, you can use the Azure command-line tool. The user must have Microsoft.directory/servicePrincipals/create permissions.

Run the following command:

export SP_ID=$(az ad sp create --id APP_ID | jq --raw-output '.objectId')

Replace APP_ID with the app ID that was returned previously.

Microsoft REST API

To create the service principal for the app ID APP_ID that was returned previously, you can send an HTTP request to the Microsoft Graph REST API.

Retrieve an OAuth token for an application to call the Microsoft Graph REST API. The application that you use should have the Application.ReadWrite.All application permission.

The TENANT_ID should match the ID of the Azure directory that contains the Azure Storage account.

To generate an OAuth token, run the following command:

export TOKEN=$(curl -X POST \
https://login.microsoftonline.com/TENANT_ID/oauth2/token \
-H 'cache-control: no-cache' \
-H 'content-type: application/x-www-form-urlencoded' \
--data-urlencode "grant_type=client_credentials" \
--data-urlencode "resource=https://graph.microsoft.com/" \
--data-urlencode "client_id=CLIENT_ID" \
--data-urlencode "client_secret=CLIENT_SECRET" \
| jq --raw-output '.access_token')

Replace the following:

  • TENANT_ID: the tenant ID of the Azure directory that contains the Azure Storage account.
  • CLIENT_ID: the Client ID.
  • CLIENT_SECRET: the Client Secret.

Run the following command:

export APP_ID=APP_ID

Replace APP_ID with the app ID that was returned.

To create a service principal by calling the Microsoft Graph REST API, run the following command:

export SP_ID=$(curl -X POST \
https://graph.microsoft.com/v1.0/serviceprincipals \
-H "authorization: Bearer ${TOKEN?}" \
-H 'cache-control: no-cache' \
-H 'content-type: application/json' \
-d "{
  \"appId\": \"${APP_ID?}\"
}" | jq --raw-output '.id')

Replace the following:

  • TOKEN: the token that you retrieved in the previous step.
  • APP_ID: the app ID that was returned previously.

Assigning a role to BigQuery's Azure AD applications

To assign a role to BigQuery's Azure AD application, use the Azure Portal, the Azure PowerShell, or the Microsoft Management REST API:

Azure Portal

You can perform role assignments in the Azure Portal by logging in as a user with the Microsoft.Authorization/roleAssignments/write permission. The role assignment allows the BigQuery Azure connection to access the Azure Storage data as specified in the roles policy.

To add role assignments using the Azure Portal, follow these steps:

  1. From your Azure Storage account, enter IAM in the search bar. Click Access Control (IAM).

  2. Click Add and select Add role assignments.

  3. To provide read-only access, select the role Storage Blob Data Reader. To provide read-write access, select the role Storage Blob Data Contributor.

  4. Set Assign access to to User, group, or service principal.

  5. In the Select input field, search for the app name APP_NAME that was returned previously, and select the matching result.

  6. Click Save.

Azure PowerShell

To add a role assignment for a service principal at a resource scope, you can use Azure PowerShell.

Run the following command:

New-AzRoleAssignment`
 -SignInName APP_NAME`
 -RoleDefinitionName ROLE_NAME`
 -ResourceName RESOURCE_NAME`
 -ResourceType RESOURCE_TYPE`
 -ParentResource PARENT_RESOURCE`
 -ResourceGroupName RESOURCE_GROUP_NAME

Replace the following:

  • APP_NAME: the application name.
  • ROLE_NAME: the role name you want to assign.
  • RESOURCE_NAME: the resource name.
  • RESOURCE_TYPE: the resource type.
  • PARENT_RESOURCE: the parent resource.
  • RESOURCE_GROUP_NAME: the resource group name.

Azure CLI

To add a role assignment for a service principal at a resource scope, you can use the Azure command-line tool. The following command grants the Storage Blob Data Contributor role to a storage account. The user must have the Microsoft.Authorization/roleAssignments/write permission for the storage account.

To assign a role, such as the Storage Blob Data Contributor role, to the service principal, run the following command:

az role assignment create --role "Storage Blob Data Contributor" \
  --assignee-object-id ${SP_ID} \
  --assignee-principal-type ServicePrincipal \
  --scope subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME

Replace the following:

  • SP_ID: the service principle ID
  • STORAGE_ACCOUNT_NAME: the storage account name
  • RESOURCE_GROUP_NAME: the resource group name
  • SUBSCRIPTION_ID: the subscription ID

Microsoft REST API

To add role assignments for a service principal, you can send an HTTP request to Microsoft Management.

To call the Microsoft Graph REST API, retrieve an OAuth token for an application. The application used to call the Microsoft Graph REST API should have the Application.ReadWrite.All application permission.

To generate an OAuth token, run the following command:

export TOKEN=$(curl -X POST \
https://login.microsoftonline.com/TENANT_ID/oauth2/token \
-H 'cache-control: no-cache' \
-H 'content-type: application/x-www-form-urlencoded' \
--data-urlencode "grant_type=client_credentials" \
--data-urlencode "resource=https://graph.microsoft.com/" \
--data-urlencode "client_id=CLIENT_ID" \
--data-urlencode "client_secret=CLIENT_SECRET" \
| jq --raw-output '.access_token')

Replace the following:

  • TENANT_ID: the tenant ID matches the ID of the Azure directory that contains the Azure Storage account.
  • CLIENT_ID: Azure client ID.
  • CLIENT_SECRET: Azure client secret.

Get the ID of the Azure built-in roles that you want to assign to the service principal.

These are some common roles: * Storage Blob Data Contributor: ba92f5b4-2d11-453d-a403-e96b0029c9fe * Storage Blob Data Reader: 2a2b9908-6ea1-4ae2-8e65-a410df84e7d1

To assign a role to the service principle, call the Microsoft Graph REST API to the Azure Resource Management REST API:

export ROLE_ASSIGNMENT_ID=$(uuidgen)
curl -X PUT \
'https://management.azure.com/subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME/providers/Microsoft.Authorization/roleAssignments/ROLE_ASSIGNMENT_ID?api-version=2018-01-01-preview' \
  -H "authorization: Bearer ${TOKEN?}" \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
      "properties": {
          "roleDefinitionId": "subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME/providers/Microsoft.Authorization/roleDefinitions/ROLE_ID",
          "principalId": "SP_ID"
      }
  }'

Replace the following: - ROLE_ASSIGNMENT_ID: the role ID - SP_ID: the service principle ID - SUBSCRIPTION_ID: the subscription ID - RESOURCE_GROUP_NAME: the resource group name - STORAGE_ACCOUNT_NAME: - SUBSCRIPTION_ID: the subscription ID

The connection is now ready to use.

What's next