Transcribe audio files with the ML.TRANSCRIBE function

This document describes how to use the ML.TRANSCRIBE function with a remote model to transcribe audio files from an object table.
Supported locations
You must create the remote model used in this procedure in one of the following locations:
asia-northeast1
asia-south1
asia-southeast1
australia-southeast1
eu
europe-west1
europe-west2
europe-west3
europe-west4
northamerica-northeast1
us
us-central1
us-east1
us-east4
us-west1
You must run the ML.TRANSCRIBE function in the same region as the remote model.
Required permissions

To work with a Speech-to-Text recognizer, you need the following permissions:
- speech.recognizers.create
- speech.recognizers.get
- speech.recognizers.recognize
- speech.recognizers.update

To create a connection, you need the following role:
- roles/bigquery.connectionAdmin

To create the model using BigQuery ML, you need the following permissions:
- bigquery.jobs.create
- bigquery.models.create
- bigquery.models.getData
- bigquery.models.updateData
- bigquery.models.updateMetadata

To run inference, you need the following permissions:
- bigquery.tables.getData on the object table
- bigquery.models.getData on the model
- bigquery.jobs.create
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Google Cloud project.
- Enable the BigQuery, BigQuery Connection API, and Speech-to-Text APIs.
Create a recognizer
Speech-to-Text supports resources called recognizers. Recognizers represent stored and reusable recognition configurations. You can create a recognizer to logically group together transcriptions or traffic for your application.
Creating a speech recognizer is optional. If you choose to create a speech recognizer, note the project ID, location, and recognizer ID of the recognizer for use in the CREATE MODEL statement, as described in SPEECH_RECOGNIZER.

If you choose not to create a speech recognizer, you must specify a value for the recognition_config argument of the ML.TRANSCRIBE function.

You can only use the chirp transcription model in the speech recognizer or recognition_config value that you provide.
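If you do create a recognizer, one way to do so is through the Speech-to-Text v2 REST API. The following curl command is a sketch, not a prescription: the recognizer ID (my-recognizer), PROJECT_ID, and LOCATION are placeholders, and you should confirm the request body against the Speech-to-Text v2 Recognizer reference.

```sh
# Create a Speech-to-Text v2 recognizer whose default configuration
# uses the chirp model. All identifiers below are placeholders.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://speech.googleapis.com/v2/projects/PROJECT_ID/locations/LOCATION/recognizers?recognizer_id=my-recognizer" \
  -d '{
    "defaultRecognitionConfig": {
      "languageCodes": ["en-US"],
      "model": "chirp",
      "autoDecodingConfig": {}
    }
  }'
```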
Create a connection
Create a cloud resource connection and get the connection's service account.
Select one of the following options:
Console
1. Go to the BigQuery page.
2. To create a connection, click Add, and then click Connections to external data sources.
3. In the Connection type list, select Vertex AI remote models, remote functions and BigLake (Cloud Resource).
4. In the Connection ID field, enter a name for your connection.
5. Click Create connection.
6. Click Go to connection.
7. In the Connection info pane, copy the service account ID for use in a later step.
bq
In a command-line environment, create a connection:
```sh
bq mk --connection --location=REGION --project_id=PROJECT_ID \
    --connection_type=CLOUD_RESOURCE CONNECTION_ID
```

The --project_id parameter overrides the default project.

Replace the following:
- REGION: your connection region
- PROJECT_ID: your Google Cloud project ID
- CONNECTION_ID: an ID for your connection
When you create a connection resource, BigQuery creates a unique system service account and associates it with the connection.
Troubleshooting: If you get the following connection error, update the Google Cloud SDK:
Flags parsing error: flag --connection_type=CLOUD_RESOURCE: value should be one of...
Retrieve and copy the service account ID for use in a later step:
```sh
bq show --connection PROJECT_ID.REGION.CONNECTION_ID
```

The output is similar to the following:

```
name                       properties
1234.REGION.CONNECTION_ID  {"serviceAccountId": "connection-1234-9u56h9@gcp-sa-bigquery-condel.iam.gserviceaccount.com"}
```
Terraform
Append the following section into your main.tf file.

```hcl
## This creates a cloud resource connection.
## Note: The cloud resource nested object has only one output-only field: serviceAccountId.
resource "google_bigquery_connection" "connection" {
  connection_id = "CONNECTION_ID"
  project       = "PROJECT_ID"
  location      = "REGION"
  cloud_resource {}
}
```

Replace the following:
- CONNECTION_ID: an ID for your connection
- PROJECT_ID: your Google Cloud project ID
- REGION: your connection region
Grant access to the service account
Select one of the following options:
Console
1. Go to the IAM & Admin page.
2. Click Grant Access. The Add principals dialog opens.
3. In the New principals field, enter the service account ID that you copied earlier.
4. Click the Select a role field, and then type Cloud Speech Client in Filter.
5. Click Add another role.
6. In the Select a role field, select Cloud Storage, and then select Storage Object Viewer.
7. Click Save.
gcloud
Use the gcloud projects add-iam-policy-binding command:

```sh
gcloud projects add-iam-policy-binding 'PROJECT_NUMBER' \
    --member='serviceAccount:MEMBER' \
    --role='roles/speech.client' --condition=None

gcloud projects add-iam-policy-binding 'PROJECT_NUMBER' \
    --member='serviceAccount:MEMBER' \
    --role='roles/storage.objectViewer' --condition=None
```

Replace the following:
- PROJECT_NUMBER: your project number.
- MEMBER: the service account ID that you copied earlier.

Failure to grant the permission results in a Permission denied error.
Create a dataset
Create a dataset to contain the model and the object table.
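For example, you can create the dataset with the bq command-line tool. The dataset name mydataset and the US location below are illustrative; use the same location where you plan to create the remote model:

```sh
# Create a dataset in the location where the remote model will live.
# mydataset is a placeholder name.
bq --location=us mk --dataset PROJECT_ID:mydataset
```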
Create an object table
Create an object table over a set of audio files in Cloud Storage. The audio files in the object table must be of a supported type.
The Cloud Storage bucket used by the object table should be in the same project where you plan to create the model and call the ML.TRANSCRIBE function. If you want to call the ML.TRANSCRIBE function in a different project than the one that contains the Cloud Storage bucket used by the object table, you must grant the Storage Admin role at the bucket level to the service-A@gcp-sa-aiplatform.iam.gserviceaccount.com service account.
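As a sketch, an object table over audio files can be created with a CREATE EXTERNAL TABLE statement like the following; the table name, connection, and bucket path are placeholders:

```sql
-- Create an object table over audio files in Cloud Storage.
-- The table name, connection, and URI below are illustrative.
CREATE EXTERNAL TABLE `myproject.mydataset.audio`
WITH CONNECTION `myproject.us.myconnection`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://mybucket/audio/*']
);
```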
Create a model
Create a remote model with a REMOTE_SERVICE_TYPE of CLOUD_AI_SPEECH_TO_TEXT_V2:

```sql
CREATE OR REPLACE MODEL `PROJECT_ID.DATASET_ID.MODEL_NAME`
REMOTE WITH CONNECTION `PROJECT_ID.REGION.CONNECTION_ID`
OPTIONS (
  REMOTE_SERVICE_TYPE = 'CLOUD_AI_SPEECH_TO_TEXT_V2',
  SPEECH_RECOGNIZER = 'projects/PROJECT_NUMBER/locations/LOCATION/recognizers/RECOGNIZER_ID'
);
```
Replace the following:
- PROJECT_ID: your project ID.
- DATASET_ID: the ID of the dataset to contain the model.
- MODEL_NAME: the name of the model.
- REGION: the region used by the connection.
- CONNECTION_ID: the connection ID, for example, myconnection. When you view the connection details in the Google Cloud console, the connection ID is the value in the last section of the fully qualified connection ID that is shown in Connection ID, for example, projects/myproject/locations/connection_location/connections/myconnection.
- PROJECT_NUMBER: the project number of the project that contains the speech recognizer. You can find this value on the Project info card in the Dashboard page of the Google Cloud console.
- LOCATION: the location used by the speech recognizer. You can find this value in the Location field on the List recognizers page of the Google Cloud console.
- RECOGNIZER_ID: the speech recognizer ID. You can find this value in the ID field on the List recognizers page of the Google Cloud console.

The SPEECH_RECOGNIZER option isn't required. If you don't specify a value for it, a default recognizer is used. In that case, you must specify a value for the recognition_config parameter of the ML.TRANSCRIBE function in order to provide a configuration for the default recognizer. You can only use the chirp transcription model in the recognition_config value that you provide.
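With placeholder values filled in, the statement looks like the following sketch; the project, dataset, connection, and recognizer identifiers are illustrative:

```sql
-- Illustrative example; all identifiers are placeholders.
CREATE OR REPLACE MODEL `myproject.mydataset.transcribe_model`
REMOTE WITH CONNECTION `myproject.us.myconnection`
OPTIONS (
  REMOTE_SERVICE_TYPE = 'CLOUD_AI_SPEECH_TO_TEXT_V2',
  SPEECH_RECOGNIZER = 'projects/123456789/locations/us-central1/recognizers/my-recognizer'
);
```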
Transcribe audio files
Transcribe audio files with the ML.TRANSCRIBE function:

```sql
SELECT *
FROM ML.TRANSCRIBE(
  MODEL `PROJECT_ID.DATASET_ID.MODEL_NAME`,
  TABLE `PROJECT_ID.DATASET_ID.OBJECT_TABLE_NAME`,
  RECOGNITION_CONFIG => (JSON 'recognition_config')
);
```
Replace the following:
- PROJECT_ID: your project ID.
- DATASET_ID: the ID of the dataset that contains the model.
- MODEL_NAME: the name of the model.
- OBJECT_TABLE_NAME: the name of the object table that contains the URIs of the audio files to process.
- recognition_config: a RecognitionConfig resource in JSON format.

If a recognizer has been specified for the remote model by using the SPEECH_RECOGNIZER option, you can't specify a recognition_config value.

If no recognizer has been specified for the remote model by using the SPEECH_RECOGNIZER option, you must specify a recognition_config value. This value is used to provide a configuration for the default recognizer. You can only use the chirp transcription model in the recognition_config value that you provide.
Examples
Example 1
The following example transcribes the audio files represented by the audio table without overriding the recognizer's default configuration:

```sql
SELECT *
FROM ML.TRANSCRIBE(
  MODEL `myproject.mydataset.transcribe_model`,
  TABLE `myproject.mydataset.audio`
);
```
Example 2

The following example transcribes the audio files represented by the audio table and provides a configuration for the default recognizer:

```sql
SELECT *
FROM ML.TRANSCRIBE(
  MODEL `myproject.mydataset.transcribe_model`,
  TABLE `myproject.mydataset.audio`,
  recognition_config => (JSON '{"language_codes": ["en-US"], "model": "chirp", "auto_decoding_config": {}}')
);
```
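Because the recognition_config argument is plain JSON, you can assemble it programmatically before embedding it in the query string. The following Python sketch builds the configuration used in the example above; the helper function and the table names are illustrative, not part of any BigQuery client API:

```python
import json

def build_recognition_config(language_codes, model="chirp"):
    """Builds a RecognitionConfig JSON string for ML.TRANSCRIBE.

    Only the chirp transcription model can be used in the
    recognition_config value, so it is the default here.
    """
    config = {
        "language_codes": language_codes,
        "model": model,
        "auto_decoding_config": {},  # let Speech-to-Text detect the audio encoding
    }
    return json.dumps(config)

# Embed the serialized config in the ML.TRANSCRIBE call.
config_json = build_recognition_config(["en-US"])
query = f"""
SELECT *
FROM ML.TRANSCRIBE(
  MODEL `myproject.mydataset.transcribe_model`,
  TABLE `myproject.mydataset.audio`,
  recognition_config => (JSON '{config_json}')
)
"""
```

Serializing with json.dumps avoids hand-writing JSON inside the SQL string and surfaces malformed configurations before the query runs.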
What's next
- For information about model inference in BigQuery ML, see Model inference overview.
- For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.