Export Speech-to-Text transcript to Cloud Storage (Beta)
This sample demonstrates how to export a speech-to-text transcript to a Cloud Storage bucket.
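The export destination in this sample is addressed as a `gs://` URI built from a bucket name and an object name. As a minimal sketch of that step, composing and splitting such a URI might look like the following. The helper names `build_gcs_uri` and `split_gcs_uri` are hypothetical and are not part of any Google client library, and the validation is deliberately simple.

```python
def build_gcs_uri(bucket_name: str, object_name: str) -> str:
    """Compose a gs:// URI from a bucket name and an object name (hypothetical helper)."""
    if not bucket_name or "/" in bucket_name:
        raise ValueError(f"Invalid bucket name: {bucket_name!r}")
    if not object_name:
        raise ValueError("Object name must not be empty")
    return f"gs://{bucket_name}/{object_name}"


def split_gcs_uri(uri: str) -> tuple[str, str]:
    """Split a gs:// URI into (bucket_name, object_name) (hypothetical helper)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Not a Cloud Storage URI: {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the object name.
    bucket_name, _, object_name = uri[len(prefix):].partition("/")
    if not bucket_name or not object_name:
        raise ValueError(f"URI must include a bucket and an object name: {uri!r}")
    return bucket_name, object_name
```

Splitting is useful when only the full URI is at hand but the Cloud Storage client needs the bucket name and object name separately.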
Code sample
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
### Python

To learn how to install and use the client library for Speech-to-Text, see
[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the
[Speech-to-Text Python API reference documentation](/python/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.cloud import speech
    from google.cloud import storage
    from google.cloud.speech_v1 import types


    def export_transcript_to_storage_beta(
        audio_uri: str,
        output_bucket_name: str,
        output_filename: str,
    ) -> types.LongRunningRecognizeResponse:
        """Transcribes an audio file from Cloud Storage and exports the transcript to a Cloud Storage bucket.
        Args:
            audio_uri (str): The Cloud Storage URI of the input audio, e.g., gs://[BUCKET]/[FILE]
            output_bucket_name (str): Name of the Cloud Storage bucket to store the output transcript.
            output_filename (str): Name of the output file to store the transcript.
        Returns:
            types.LongRunningRecognizeResponse: The response containing the transcription results.
        """

        audio = speech.RecognitionAudio(uri=audio_uri)
        output_storage_uri = f"gs://{output_bucket_name}/{output_filename}"

        # Pass in the URI of the Cloud Storage bucket to hold the transcription
        output_config = speech.TranscriptOutputConfig(gcs_uri=output_storage_uri)

        # Speech configuration object
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=8000,
            language_code="en-US",
        )

        # Compose the long-running request
        request = speech.LongRunningRecognizeRequest(
            audio=audio, config=config, output_config=output_config
        )

        # Create the speech client
        speech_client = speech.SpeechClient()
        # Create the storage client
        storage_client = storage.Client()

        # Run the recognizer to export transcript
        operation = speech_client.long_running_recognize(request=request)
        print("Waiting for operation to complete...")
        operation.result(timeout=90)

        # Get bucket with name
        bucket = storage_client.get_bucket(output_bucket_name)
        # Get blob (file) from bucket
        blob = bucket.get_blob(output_filename)

        # Get content as bytes
        results_bytes = blob.download_as_bytes()
        # Get transcript exported in storage bucket
        storage_transcript = types.LongRunningRecognizeResponse.from_json(
            results_bytes, ignore_unknown_fields=True
        )

        # Each result is for a consecutive portion of the audio. Iterate through
        # them to get the transcripts for the entire audio file.
        for result in storage_transcript.results:
            # The first alternative is the most likely one for this portion.
            print(f"Transcript: {result.alternatives[0].transcript}")
            print(f"Confidence: {result.alternatives[0].confidence}")

        # Return the parsed response, as the signature and docstring promise
        return storage_transcript

What's next
-----------

To search and filter code samples for other Google Cloud products, see the
[Google Cloud sample browser](/docs/samples?product=speech).
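The sample on this page reads the exported object back by parsing it as JSON into a `LongRunningRecognizeResponse`. For a quick offline check of an exported file, the same fields can be read with only the standard library. This is a sketch, assuming the exported object is a JSON document with a top-level `results` list whose entries each carry an `alternatives` list with `transcript` and `confidence` fields, mirroring what the sample prints. The function name `read_transcript` and the example payload are hypothetical.

```python
import json


def read_transcript(results_bytes: bytes) -> list[tuple[str, float]]:
    """Return (transcript, confidence) for the top alternative of each result."""
    payload = json.loads(results_bytes)
    pairs = []
    # Assumption: "results" may be missing when nothing was recognized.
    for result in payload.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue  # Skip results that carry no alternatives.
        top = alternatives[0]  # The first alternative is the most likely one.
        pairs.append((top.get("transcript", ""), float(top.get("confidence", 0.0))))
    return pairs


# Hypothetical payload shaped like the fields the sample prints.
example = b'{"results": [{"alternatives": [{"transcript": "hello world", "confidence": 0.98}]}]}'
for transcript, confidence in read_transcript(example):
    print(f"Transcript: {transcript}")
    print(f"Confidence: {confidence}")
```

Unlike `from_json` in the sample, this approach does no schema validation, so it is only suited to quick inspection of a downloaded file.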