This guide walks you through the process of running a Speech-to-Text test using Google's Vertex AI Speech service.
Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.
Create a Python file named speech-to-text-test.py. Replace the AUDIO_FILE_URI value with the URI of a source audio file, as shown:

from google.cloud import speech


def transcribe_gcs_audio(gcs_uri: str) -> speech.RecognizeResponse:
    client = speech.SpeechClient()

    audio = speech.RecognitionAudio(uri=gcs_uri)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=16000,
        language_code="en-US",  # Specify the language code (e.g., "en-US" for US English)
        # You can add more features here, e.g.:
        # enable_automatic_punctuation=True,
        # model="default"  # or "latest_long", "phone_call", "video", "chirp" (v2 API)
    )

    # Performs synchronous speech recognition on the audio file
    response = client.recognize(config=config, audio=audio)

    # Print the transcription
    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")
        if result.alternatives[0].confidence:
            print(f"Confidence: {result.alternatives[0].confidence:.2f}")

    return response


if __name__ == "__main__":
    # Replace with the URI of your audio file in Google Cloud Storage
    audio_file_uri = "AUDIO_FILE_URI"
    print(f"Transcribing audio from: {audio_file_uri}")
    transcribe_gcs_audio(audio_file_uri)

Replace the following:
- AUDIO_FILE_URI: the URI of an audio file in Cloud Storage, for example "gs://your-bucket/your-audio.flac".
Create a Dockerfile:

FROM python:3.9-slim

WORKDIR /app

COPY speech-to-text-test.py /app/

# Install the Speech-to-Text client library imported by the script
RUN pip install --no-cache-dir google-cloud-speech

CMD ["python", "speech-to-text-test.py"]

Build the Docker image for the Speech-to-Text application:
docker build -t speech-to-text-app .
Follow the instructions in Configure Docker to:
- Configure Docker,
- Create a secret, and
- Upload the image to HaaS.
Sign in to the user cluster and generate its kubeconfig file with a user identity. Make sure that you set the kubeconfig path as an environment variable:
export KUBECONFIG=${CLUSTER_KUBECONFIG_PATH}
Create a Kubernetes secret by running the following command in your terminal, pasting your API key:

kubectl create secret generic gcp-api-key-secret \
  --from-literal=GCP_API_KEY='PASTE_YOUR_API_KEY_HERE'

This command creates a secret named gcp-api-key-secret with a key GCP_API_KEY.
Apply the Kubernetes manifest:

apiVersion: batch/v1
kind: Job
metadata:
  name: speech-to-text-test-job
spec:
  template:
    spec:
      containers:
      - name: speech-to-text-test-container
        image: HARBOR_INSTANCE_URL/HARBOR_PROJECT/speech-to-text-app:latest # Your image path
        # Mount the API key from the secret into the container
        # as an environment variable named GCP_API_KEY.
        envFrom:
        - secretRef:
            name: gcp-api-key-secret
      # Credentials used to pull the image from the registry
      imagePullSecrets:
      - name: SECRET
      restartPolicy: Never
  backoffLimit: 4

Replace the following:
- HARBOR_INSTANCE_URL: the Harbor instance URL.
- HARBOR_PROJECT: the Harbor project.
- SECRET: the name of the secret created to store Docker credentials.
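The manifest exposes the secret to the container as the GCP_API_KEY environment variable, but the sample script does not read that variable itself. A minimal sketch of how the script could pick it up follows. It assumes that the google-cloud-speech version installed in the image accepts an `api_key` entry in `client_options`; verify this against your library version. The helper names `load_api_key` and `make_speech_client` are hypothetical.

```python
import os


def load_api_key(env=None, name: str = "GCP_API_KEY") -> str:
    """Read the API key injected by the Kubernetes secret; fail fast if absent."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {name} is not set or is empty")
    return key


def make_speech_client():
    """Build a SpeechClient that authenticates with the API key.

    Assumption: the client library version in the image supports the
    api_key client option.
    """
    # Imported lazily so load_api_key can be used without the library installed.
    from google.cloud import speech

    return speech.SpeechClient(client_options={"api_key": load_api_key()})
```

Failing fast on a missing key makes a misconfigured secret show up in the pod logs as a clear error rather than as an authentication failure from the API.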
Check the job status:

kubectl get jobs/speech-to-text-test-job
# It will show 0/1 completions, then 1/1 after it succeeds

After the job has completed, you can view the output in the pod logs:
kubectl logs -l job-name=speech-to-text-test-job
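
The status check can also be scripted. The following is a minimal sketch, assuming the output of `kubectl get job speech-to-text-test-job -o json` has already been parsed into a dict. The function names `summarize_job` and `completions` are hypothetical. The sketch relies on the `Complete` and `Failed` condition types and the `succeeded` counter in the batch/v1 Job status.

```python
def summarize_job(job: dict) -> str:
    """Classify a batch/v1 Job object as 'succeeded', 'failed', or 'running'."""
    status = job.get("status", {})
    # A terminal condition with status "True" decides the outcome.
    for cond in status.get("conditions", []):
        if cond.get("status") == "True":
            if cond.get("type") == "Complete":
                return "succeeded"
            if cond.get("type") == "Failed":
                return "failed"
    # No terminal condition yet: the job is pending, active, or retrying.
    return "running"


def completions(job: dict) -> str:
    """Format succeeded/desired counts in the style of the COMPLETIONS column."""
    succeeded = job.get("status", {}).get("succeeded", 0)
    desired = job.get("spec", {}).get("completions", 1)
    return f"{succeeded}/{desired}"
```

A polling loop could call `summarize_job` until it returns something other than `"running"`, then fetch the logs.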