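This guide walks you through running a Speech-to-Text test with Google's Vertex AI Speech service.
Before you try this sample, follow the Python setup instructions in the Vertex AI quickstart. For more information, see the Vertex AI Python API reference documentation.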
Create a Python file named speech-to-text-test.py. Replace the AUDIO_FILE_URI value with the URI of your source audio file, as follows:

    from google.cloud import speech


    def transcribe_gcs_audio(gcs_uri: str) -> speech.RecognizeResponse:
        """Transcribe a FLAC audio file stored in Cloud Storage."""
        client = speech.SpeechClient()

        audio = speech.RecognitionAudio(uri=gcs_uri)
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
            sample_rate_hertz=16000,
            language_code="en-US",  # Specify the language code (e.g., "en-US" for US English)
            # You can add more features here, e.g.:
            # enable_automatic_punctuation=True,
            # model="default"  # or "latest_long", "phone_call", "video", "chirp" (v2 API)
        )

        # Perform synchronous speech recognition on the audio file
        response = client.recognize(config=config, audio=audio)

        # Print the transcription
        for result in response.results:
            print(f"Transcript: {result.alternatives[0].transcript}")
            if result.alternatives[0].confidence:
                print(f"Confidence: {result.alternatives[0].confidence:.2f}")

        return response


    if __name__ == "__main__":
        # Replace with the URI of your audio file in Google Cloud Storage
        audio_file_uri = "AUDIO_FILE_URI"

        print(f"Transcribing audio from: {audio_file_uri}")
        transcribe_gcs_audio(audio_file_uri)

Replace the following:
AUDIO_FILE_URI: the URI of the audio file, for example "gs://your-bucket/your-audio.flac"
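Because the placeholder is easy to leave unchanged or to fill with the wrong kind of path, it can help to check the URI before calling the API. The following is a minimal sketch that uses only the standard library. The helper name parse_gcs_uri is hypothetical and is not part of the google-cloud-speech library, and it checks only the general gs://BUCKET/OBJECT shape, not the full Cloud Storage naming rules.

```python
def parse_gcs_uri(gcs_uri: str) -> tuple[str, str]:
    """Split a gs://BUCKET/OBJECT URI into (bucket, object_name).

    Hypothetical helper for illustration only. Raises ValueError when the
    URI does not start with gs:// or lacks a bucket or an object name.
    """
    prefix = "gs://"
    if not gcs_uri.startswith(prefix):
        raise ValueError(f"Expected a gs:// URI, got: {gcs_uri!r}")

    # Everything before the first "/" is the bucket; the rest is the object.
    bucket, _, object_name = gcs_uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"URI must name a bucket and an object: {gcs_uri!r}")
    return bucket, object_name


if __name__ == "__main__":
    print(parse_gcs_uri("gs://your-bucket/audio/sample.flac"))
```

Calling a check like this at the top of the script, before transcribe_gcs_audio, makes an unreplaced AUDIO_FILE_URI placeholder fail immediately with a readable message.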
Create a Dockerfile:

    FROM python:3.9-slim
    WORKDIR /app
    COPY speech-to-text-test.py /app/
    # Install the Speech-to-Text client library that the script imports
    RUN pip install --no-cache-dir google-cloud-speech
    CMD ["python", "speech-to-text-test.py"]

Build the Docker image for the Speech-to-Text application:

    docker build -t speech-to-text-app .

Follow the instructions in "Configure Docker" to do the following:
- Configure Docker,
- Create a secret, and
- Upload the image to HaaS.
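The manifest later in this guide refers to the uploaded image as HARBOR_INSTANCE_URL/HARBOR_PROJECT/speech-to-text-app:latest. An image reference cannot contain a URL scheme such as https://, so the following sketch shows one way to assemble that string safely. The function name harbor_image_ref and the example host are hypothetical, and the exact tag and push commands come from the "Configure Docker" instructions, not from this sketch.

```python
def harbor_image_ref(instance_url: str, project: str, image: str, tag: str = "latest") -> str:
    """Build HOST/PROJECT/IMAGE:TAG from a Harbor instance URL.

    Hypothetical helper: strips any scheme (for example "https://") and
    trailing slash from the instance URL before joining the parts.
    """
    host = instance_url.split("://", 1)[-1].rstrip("/")
    if not host or not project or not image:
        raise ValueError("instance URL, project, and image name are all required")
    return f"{host}/{project}/{image}:{tag}"


if __name__ == "__main__":
    # "harbor.example.com" and "my-project" are placeholder values.
    print(harbor_image_ref("https://harbor.example.com/", "my-project", "speech-to-text-app"))
```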
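Sign in to the user cluster and generate its kubeconfig file with a user identity. Make sure that you set the kubeconfig path as an environment variable:

    export KUBECONFIG=${CLUSTER_KUBECONFIG_PATH}

Create a Kubernetes secret by running the following command in your terminal and pasting in your API key:

    kubectl create secret generic gcp-api-key-secret \
      --from-literal=GCP_API_KEY='PASTE_YOUR_API_KEY_HERE'

This command creates a secret named gcp-api-key-secret with the key GCP_API_KEY.

Apply the Kubernetes manifest: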

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: speech-to-text-test-job
    spec:
      template:
        spec:
          containers:
          - name: speech-to-text-test-container
            image: HARBOR_INSTANCE_URL/HARBOR_PROJECT/speech-to-text-app:latest # Your image path
            # Expose the API key from the secret to the container
            # as an environment variable named GCP_API_KEY.
            envFrom:
            - secretRef:
                name: gcp-api-key-secret
          imagePullSecrets:
          - name: SECRET
          restartPolicy: Never
      backoffLimit: 4

Replace the following:
- HARBOR_INSTANCE_URL: the Harbor instance URL.
- HARBOR_PROJECT: the Harbor project.
- SECRET: the name of the secret created to store the Docker credentials.
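With envFrom and a secretRef, each key in the secret becomes an environment variable in the container, so the process can read GCP_API_KEY at startup. Note that the sample script in this guide creates speech.SpeechClient() with no arguments and does not read this variable itself. How the key is handed to the client is not specified here, so the following sketch stops at reading and validating the value. The helper name load_api_key is hypothetical.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ, name: str = "GCP_API_KEY") -> str:
    """Return the API key injected by the Kubernetes secret.

    Hypothetical helper: fails fast with a clear message when the variable
    is missing or blank, instead of failing later inside an API call.
    """
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "check the envFrom/secretRef section of the Job manifest."
        )
    return value


if __name__ == "__main__":
    # Pass a plain dict here so the example runs outside the cluster too.
    key = load_api_key({"GCP_API_KEY": "example-key"})
    print(f"Loaded an API key of length {len(key)}")
```

Taking the environment as a parameter keeps the function easy to exercise locally without touching the real process environment.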
Check the job status:

    kubectl get jobs/speech-to-text-test-job
    # It will show 0/1 completions, then 1/1 after it succeeds

After the job completes, you can view the output in the Pod logs:

    kubectl logs -l job-name=speech-to-text-test-job
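If you want to check the job from a script instead of reading the completions column, one option is to inspect the JSON form of the Job object, for example the output of kubectl get job speech-to-text-test-job -o json. The following sketch classifies a Job from its status.conditions list, where Kubernetes records a condition of type Complete or Failed with status "True" once the Job finishes. The function name job_state and the sample JSON are illustrative assumptions, not output captured from a real cluster.

```python
import json


def job_state(job: dict) -> str:
    """Return "succeeded", "failed", or "running" for a batch/v1 Job object.

    Hypothetical helper: a Job without a finished condition is reported
    as "running", which also covers Jobs that have not started yet.
    """
    conditions = job.get("status", {}).get("conditions", [])
    for condition in conditions:
        if condition.get("status") != "True":
            continue
        if condition.get("type") == "Complete":
            return "succeeded"
        if condition.get("type") == "Failed":
            return "failed"
    return "running"


if __name__ == "__main__":
    # Hand-written sample, shaped like a finished Job's status.
    sample = json.loads(
        '{"status": {"succeeded": 1,'
        ' "conditions": [{"type": "Complete", "status": "True"}]}}'
    )
    print(job_state(sample))
```

With backoffLimit set to 4 in the manifest, a Failed condition appears only after the retries are exhausted, so "running" can also mean that a retry is in progress.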