Create a BigQuery Azure connection
BigQuery Omni accesses Azure Storage data through connections. Each connection has its own unique Azure Active Directory (Azure AD) application. Permissions are granted to the applications through Azure identity and access management (IAM) roles. The granted Azure IAM roles determine what data BigQuery can access for each connection.
Before you begin
Ensure that you've created the following resources:
- A Google Cloud project with both BigQuery Omni and BigQuery Connection API enabled.
- An Azure tenant with an Azure subscription.
- An Azure Storage account:
- Set Account kind to Storage V2 (general purpose V2) or BlobStorage.
- Use a hierarchical namespace. BigQuery for Azure supports Azure Data Lake Storage (Gen2).
- Ensure that the data is populated in one of the supported formats.
- Ensure that the data is in the
azure-eastus2
region.
- Permission to create service principals and modify Azure IAM policies,
specifically the
Application.ReadWrite.All
policy and theAppRoleAssignment.ReadWrite.All
policy.
Creating the BigQuery Azure connection
To create a BigQuery Azure connection, use the
Google Cloud console or the bq
command-line tool:
Google Cloud console
To create a connection resource, in the Google Cloud console, go to the BigQuery page.
In the Add data
menu, select External data source.In the External data source pane, enter the following information:
- For Connection type, select Azure.
- For Connection ID, enter an identifier for the connection resource. You can use letters, numbers, dashes, and underscores.
- For Connection location, select
azure-eastus2
. - Optional: For Friendly name, enter a user-friendly name for
the connection, such as
My connection resource
. The friendly name can be any value that helps you identify the connection resource if you need to modify it later. - Optional: For Description, enter a description for the connection resource.
- For Azure tenant id, enter the Azure tenant ID, also called the directory ID.
Click Create connection.
Take note of the value in the Azure app ID field and the value in the Azure app name field for later use.
bq
bq mk --connection --connection_type='Azure' \ --tenant_id=TENANT_ID \ --location=AZURE_LOCATION \ CONNECTION_NAME
Replace the following:
TENANT_ID
: the tenant ID of the Azure directory that contains the Azure Storage account.AZURE_LOCATION
: must be the Azure region where your Azure Storage data is located. Only theazure-eastus2
region is supported.CONNECTION_NAME
: the name that you give this connection.
The command line shows the following output:
Please create a Service Principal in your directory for appId: APP_ID, and perform role assignment to app: APP_NAME to allow BigQuery to access your Azure data.
The output contains the following:
APP_ID
: the app ID that you need in order to create a service principal in your directory.APP_NAME
: the app name that you need in order to assign roles that allow BigQuery to access your Azure data.
Take note of the APP_ID
and the
APP_NAME
values for use in the next steps.
To get the output in JSON format, use the --format=json
parameter:
{"clientId": APP_ID, "application": APP_NAME}
Creating an Azure AD service principal
To create an Azure AD service principal, use the Google Cloud console, Azure PowerShell, the Azure command-line tool, or the Microsoft Graph REST API:
Google Cloud console
To create an Azure AD service principal directly, in the Google Cloud console, in the Connection info pane, click Create service principal.
Log in to your Azure account. On the Permissions Requested page that appears, click Accept.
Azure PowerShell
To create the service principle for the app ID APP_ID
that was returned previously, you can use Azure PowerShell. The user must have
Microsoft.directory/servicePrincipals/create
permissions.
Run the following command:
New-AzADServicePrincipal` -ApplicationId APP_ID` -SkipAssignment
Replace APP_ID
with the app ID that was returned
previously.
Azure CLI
To create the service principal for the app ID APP_ID
that was returned previously, you can use the Azure command-line tool.
The user must have Microsoft.directory/servicePrincipals/create
permissions.
Run the following command:
export SP_ID=$(az ad sp create --id APP_ID | jq --raw-output '.objectId')
Replace APP_ID
with the app ID that was returned
previously.
Microsoft REST API
To create the service principal for the app ID APP_ID
that was returned previously, you can send an HTTP request to the Microsoft
Graph REST API.
Retrieve an OAuth token for an application to call the
Microsoft Graph REST API.
The application that you use should have the Application.ReadWrite.All
application permission.
The TENANT_ID
should match the ID of the Azure directory that contains the
Azure Storage account.
To generate an OAuth token, run the following command:
export TOKEN=$(curl -X POST \ https://login.microsoftonline.com/TENANT_ID/oauth2/token \ -H 'cache-control: no-cache' \ -H 'content-type: application/x-www-form-urlencoded' \ --data-urlencode "grant_type=client_credentials" \ --data-urlencode "resource=https://graph.microsoft.com/" \ --data-urlencode "client_id=CLIENT_ID" \ --data-urlencode "client_secret=CLIENT_SECRET" \ | jq --raw-output '.access_token')
Replace the following:
TENANT_ID
: the tenant ID of the Azure directory that contains the Azure Storage account.CLIENT_ID
: the Client ID.CLIENT_SECRET
: the Client Secret.
Run the following command:
export APP_ID=APP_ID
Replace APP_ID
with the app ID that was returned.
To create a service principal by calling the Microsoft Graph REST API, run the following command:
export SP_ID=$(curl -X POST \ https://graph.microsoft.com/v1.0/serviceprincipals \ -H "authorization: Bearer ${TOKEN?}" \ -H 'cache-control: no-cache' \ -H 'content-type: application/json' \ -d "{ \"appId\": \"${APP_ID?}\" }" | jq --raw-output '.id')
Replace the following:
TOKEN
: the token that you retrieved in the previous step.APP_ID
: the app ID that was returned previously.
Assigning a role to BigQuery's Azure AD applications
To assign a role to BigQuery's Azure AD application, use the Azure Portal, the Azure PowerShell, or the Microsoft Management REST API:
Azure Portal
You can perform role assignments in the Azure Portal by logging in as a user
with the Microsoft.Authorization/roleAssignments/write
permission. The role
assignment allows the BigQuery Azure connection to access the
Azure Storage data as specified in the roles policy.
To add role assignments using the Azure Portal, follow these steps:
From your Azure Storage account, enter
IAM
in the search bar. Click Access Control (IAM).Click Add and select Add role assignments.
To provide read-only access, select the role Storage Blob Data Reader. To provide read-write access, select the role Storage Blob Data Contributor.
Set Assign access to to User, group, or service principal.
In the Select input field, search for the app name
APP_NAME
that was returned previously, and select the matching result.Click Save.
Azure PowerShell
To add a role assignment for a service principal at a resource scope, you can use Azure PowerShell.
Run the following command:
New-AzRoleAssignment` -SignInName APP_NAME` -RoleDefinitionName ROLE_NAME` -ResourceName RESOURCE_NAME` -ResourceType RESOURCE_TYPE` -ParentResource PARENT_RESOURCE` -ResourceGroupName RESOURCE_GROUP_NAME
Replace the following:
APP_NAME
: the application name.ROLE_NAME
: the role name you want to assign.RESOURCE_NAME
: the resource name.RESOURCE_TYPE
: the resource type.PARENT_RESOURCE
: the parent resource.RESOURCE_GROUP_NAME
: the resource group name.
Azure CLI
To add a role assignment
for a service principal at a resource scope, you can use the
Azure command-line tool.
The following command grants the Storage Blob Data Contributor
role to a
storage account. The user must have the
Microsoft.Authorization/roleAssignments/write
permission for the storage
account.
To assign a role, such as the Storage Blob Data Contributor
role, to the
service principal, run the following command:
az role assignment create --role "Storage Blob Data Contributor" \ --assignee-object-id ${SP_ID} \ --assignee-principal-type ServicePrincipal \ --scope subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME
Replace the following:
SP_ID
: the service principle IDSTORAGE_ACCOUNT_NAME
: the storage account nameRESOURCE_GROUP_NAME
: the resource group nameSUBSCRIPTION_ID
: the subscription ID
Microsoft REST API
To add role assignments for a service principal, you can send an HTTP request to Microsoft Management.
To call the Microsoft Graph REST API,
retrieve an OAuth token for an application. The application used to call the
Microsoft Graph REST API should have the Application.ReadWrite.All
application permission.
To generate an OAuth token, run the following command:
export TOKEN=$(curl -X POST \ https://login.microsoftonline.com/TENANT_ID/oauth2/token \ -H 'cache-control: no-cache' \ -H 'content-type: application/x-www-form-urlencoded' \ --data-urlencode "grant_type=client_credentials" \ --data-urlencode "resource=https://graph.microsoft.com/" \ --data-urlencode "client_id=CLIENT_ID" \ --data-urlencode "client_secret=CLIENT_SECRET" \ | jq --raw-output '.access_token')
Replace the following:
TENANT_ID
: the tenant ID matches the ID of the Azure directory that contains the Azure Storage account.CLIENT_ID
: Azure client ID.CLIENT_SECRET
: Azure client secret.
Get the ID of the Azure built-in roles that you want to assign to the service principal.
These are some common roles:
* Storage Blob Data Contributor:
ba92f5b4-2d11-453d-a403-e96b0029c9fe
* Storage Blob Data Reader:
2a2b9908-6ea1-4ae2-8e65-a410df84e7d1
To assign a role to the service principle, call the Microsoft Graph REST API to the Azure Resource Management REST API:
export ROLE_ASSIGNMENT_ID=$(uuidgen) curl -X PUT \ 'https://management.azure.com/subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME/providers/Microsoft.Authorization/roleAssignments/ROLE_ASSIGNMENT_ID?api-version=2018-01-01-preview' \ -H "authorization: Bearer ${TOKEN?}" \ -H 'cache-control: no-cache' \ -H 'content-type: application/json' \ -d '{ "properties": { "roleDefinitionId": "subscriptions/SUBSCRIPTION_ID/resourcegroups/RESOURCE_GROUP_NAME/providers/Microsoft.Storage/storageAccounts/STORAGE_ACCOUNT_NAME/providers/Microsoft.Authorization/roleDefinitions/ROLE_ID", "principalId": "SP_ID" } }'
Replace the following:
- ROLE_ASSIGNMENT_ID
: the role ID
- SP_ID
: the service principle ID
- SUBSCRIPTION_ID
: the subscription ID
- RESOURCE_GROUP_NAME
: the resource group name
- STORAGE_ACCOUNT_NAME
:
- SUBSCRIPTION_ID
: the subscription ID
The connection is now ready to use.
What's next
- For more information about working with connections, see Working with connections.