This quickstart guides the Application Operator (AO) through the process of using the Vertex AI Speech-to-Text pre-trained API on Google Distributed Cloud (GDC) air-gapped.
Before you begin
Follow these steps before trying Speech-to-Text:
Set up a project using the GDC console to group the Vertex AI services. For information about creating and using projects, see Create a project.
Ask your Project IAM Admin to grant you the AI Speech Developer (ai-speech-developer) role in your project namespace.
Download the gdcloud command-line interface (CLI).
Set up your service account
Create a service account and a service key for it. Replace SERVICE_ACCOUNT with the name of your service account, SERVICE_KEY with the name of the key file, and PROJECT_ID with your project ID.
${HOME}/gdcloud init # set URI and project
${HOME}/gdcloud auth login
${HOME}/gdcloud iam service-accounts create SERVICE_ACCOUNT --project=PROJECT_ID
${HOME}/gdcloud iam service-accounts keys create "SERVICE_KEY".json --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT
Grant access to project resources
Grant the Speech-to-Text service account access to project resources by providing your project ID, the name of your service account, and the ai-speech-developer role.
${HOME}/gdcloud iam service-accounts add-iam-policy-binding --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT --role=role/ai-speech-developer
Set your environment variables
Before running the Speech-to-Text pre-trained service, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of your service key file.
export GOOGLE_APPLICATION_CREDENTIALS="SERVICE_KEY".json
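A missing or misspelled key path tends to surface only later, as an authentication error. The following is a minimal sketch, not part of the official tooling, that confirms the variable is set and points to a readable JSON file. The function name check_credentials_file is hypothetical.

```python
import json
import os


def check_credentials_file(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return the parsed key file that env_var points to, or raise an error.

    Hypothetical helper for illustration; it only checks that the file
    exists and contains valid JSON, not that the key itself is valid.
    """
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{env_var} points to a missing file: {path}")
    with open(path, encoding="utf-8") as key_file:
        # json.load raises json.JSONDecodeError (a ValueError) on malformed input.
        return json.load(key_file)
```

Call check_credentials_file() in the same shell session where you exported the variable. It returns the parsed key file on success and raises an error that names the problem otherwise.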
Authenticate the request
You must get a token to authenticate the requests to the Speech-to-Text pre-trained service. Follow these steps:
gdcloud CLI
Export the identity token for the specified account to an environment variable:
export TOKEN="$($HOME/gdcloud auth print-identity-token --audiences=https://ENDPOINT)"
Replace ENDPOINT with the Speech-to-Text endpoint. For more information, see View service statuses and endpoints.
Python
Install the google-auth client library.
pip install google-auth
Save the following code to a Python script, and update ENDPOINT to the Speech-to-Text endpoint. For more information, see View service statuses and endpoints.
import google.auth
from google.auth.transport import requests

api_endpoint = "https://ENDPOINT"

creds, project_id = google.auth.default()
creds = creds.with_gdch_audience(api_endpoint)

def test_get_token():
    req = requests.Request()
    creds.refresh(req)
    print(creds.token)

if __name__=="__main__":
    test_get_token()
Run the script to fetch the token.
Run the Speech-to-Text pre-trained API sample script
This example shows you how to interact with a Speech-to-Text pre-trained API.
Check whether the Speech-to-Text client library is already installed.
pip freeze | grep speech # output example: google-cloud-speech==2.15.0
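The same check can be done from Python, which is convenient if you want to compare the installed version with an expected one in a script. This is a minimal sketch using the standard importlib.metadata module. The helper names installed_version and needs_reinstall are hypothetical, and the expected version is a value you supply yourself.

```python
from importlib import metadata


def installed_version(package="google-cloud-speech"):
    """Return the installed version string of package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_reinstall(expected_version, package="google-cloud-speech"):
    """Return True when package is installed with a version other than expected_version."""
    current = installed_version(package)
    return current is not None and current != expected_version
```

For example, needs_reinstall("2.15.0") returns True only when google-cloud-speech is installed at a different version, which is the case where you would uninstall it before installing the provided library.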
If the existing version doesn't match the client library in https://CONSOLE_ENDPOINT/.well-known/static/client-libraries, uninstall the client library using the following command:
pip uninstall google-cloud-speech
Download the Speech-to-Text client library from your console endpoint. Replace CONSOLE_ENDPOINT with your console endpoint.
wget https://CONSOLE_ENDPOINT/.well-known/static/client-libraries/google-cloud-speech
Extract the tar file, and install it using pip. If errors are generated because something isn't found, install any missing dependencies.
tar -xvzf CLIENT_LIBRARY
pip install -r FOLDER/requirements.txt --no-index --find-links FOLDER
Use the Speech-to-Text client library script to generate the token, and make requests to the Speech-to-Text service.
Set up your environment variable.
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "SERVICE_KEY.json"
Speech-to-Text sample
Replace ENDPOINT with the Speech-to-Text endpoint that you use for your organization, CONTENT with your base64-encoded audio, and PROJECT_ID with your project ID.
import base64
from google.cloud import speech_v1p1beta1
import google.auth
from google.auth.transport import requests
from google.api_core.client_options import ClientOptions

audience = "https://ENDPOINT:443"
api_endpoint = "ENDPOINT:443"

def get_client(creds):
    # Point the client at the Speech-to-Text endpoint of your organization.
    opts = ClientOptions(api_endpoint=api_endpoint)
    return speech_v1p1beta1.SpeechClient(credentials=creds, client_options=opts)

def main():
    # Load the default credentials and fetch a token for the audience.
    creds = None
    try:
        creds, project_id = google.auth.default()
        creds = creds.with_gdch_audience(audience)
        req = requests.Request()
        creds.refresh(req)
        print("Got token: ")
        print(creds.token)
    except Exception as e:
        print("Caught exception: " + str(e))
        raise e
    return creds

def speech_func(creds):
    tc = get_client(creds)

    # CONTENT is the base64-encoded audio; decode it to raw bytes.
    content = "CONTENT"

    audio = speech_v1p1beta1.RecognitionAudio()
    audio.content = base64.standard_b64decode(content)

    # Describe the audio: 16-bit linear PCM, 16 kHz, mono, US English.
    config = speech_v1p1beta1.RecognitionConfig()
    config.encoding = speech_v1p1beta1.RecognitionConfig.AudioEncoding.LINEAR16
    config.sample_rate_hertz = 16000
    config.language_code = "en-US"
    config.audio_channel_count = 1

    metadata = [("x-goog-user-project", "projects/PROJECT_ID")]
    resp = tc.recognize(config=config, audio=audio, metadata=metadata)
    print(resp)

if __name__ == "__main__":
    creds = main()
    speech_func(creds)
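The sample passes CONTENT through base64.standard_b64decode and declares the audio as LINEAR16 at 16000 Hz with one channel, so CONTENT needs to be base64-encoded audio that matches those settings. The following is a minimal sketch of one way to produce such a string, assuming your audio is stored as an uncompressed PCM WAV file. The helper name wav_to_base64 is hypothetical and is not part of the client library.

```python
import base64
import wave


def wav_to_base64(path, expected_rate=16000, expected_channels=1):
    """Check a PCM WAV file against the sample's config and base64-encode its samples.

    Hypothetical helper for illustration. It returns only the raw audio
    frames (without the WAV header) as a base64 string.
    """
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != expected_rate:
            raise ValueError(
                f"sample rate is {wav.getframerate()} Hz, expected {expected_rate} Hz"
            )
        if wav.getnchannels() != expected_channels:
            raise ValueError(
                f"file has {wav.getnchannels()} channels, expected {expected_channels}"
            )
        if wav.getsampwidth() != 2:
            # LINEAR16 means 16-bit samples, that is, 2 bytes per sample.
            raise ValueError(
                f"sample width is {wav.getsampwidth()} bytes, expected 2"
            )
        frames = wav.readframes(wav.getnframes())
    return base64.standard_b64encode(frames).decode("ascii")
```

The returned string can replace CONTENT in the sample. If the file uses a different sample rate or channel count, either convert the audio or change the corresponding config fields so that they describe the audio you send.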
What's next
- Learn more about how to Transcribe audio.