Transcribe a file in Cloud Storage with word-level confidence (beta)
Transcribe an audio file stored in Cloud Storage, returning the confidence level for each word.
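
As a rough illustration of what per-word confidence can be used for, the sketch below flags words whose confidence falls under a cutoff so they can be reviewed. It is not part of the official sample: the `Word` class is a hypothetical stand-in for the entries of `alternative.words` (which expose `word` and `confidence` fields in the samples on this page), the 0.8 threshold is an arbitrary assumption, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Word:
    """Hypothetical stand-in for one entry of `alternative.words`."""

    word: str
    confidence: float


def low_confidence_words(
    words: Iterable[Word], threshold: float = 0.8
) -> List[Tuple[str, float]]:
    """Return (word, confidence) pairs whose confidence is below `threshold`."""
    return [(w.word, w.confidence) for w in words if w.confidence < threshold]


# Made-up values for illustration only; real values come from the API response.
sample = [
    Word("how", 0.98),
    Word("old", 0.61),
    Word("is", 0.95),
    Word("the", 0.97),
    Word("bridge", 0.74),
]

for word, confidence in low_confidence_words(sample):
    print(f"review: {word} ({confidence:.2f})")
```

With a real response, the same function could presumably be called on `result.alternatives[0].words` for each result, since it only relies on the `word` and `confidence` attributes.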
Explore further
---------------

For detailed documentation that includes this code sample, see the following:

- [Enable word-level confidence](/speech-to-text/docs/word-confidence)

Code sample
-----------

### Java

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Java API reference documentation](/java/docs/reference/google-cloud-speech/latest/overview).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    /**
     * Transcribe a remote audio file with word level confidence
     *
     * @param gcsUri path to the remote audio file
     */
    public static void transcribeWordLevelConfidenceGcs(String gcsUri) throws Exception {
      try (SpeechClient speechClient = SpeechClient.create()) {

        // Configure request to enable word level confidence
        RecognitionConfig config =
            RecognitionConfig.newBuilder()
                .setEncoding(AudioEncoding.FLAC)
                .setSampleRateHertz(44100)
                .setLanguageCode("en-US")
                .setEnableWordConfidence(true)
                .build();

        // Set the remote path for the audio file
        RecognitionAudio audio = RecognitionAudio.newBuilder().setUri(gcsUri).build();

        // Use non-blocking call for getting file transcription
        OperationFuture<LongRunningRecognizeResponse, LongRunningRecognizeMetadata> response =
            speechClient.longRunningRecognizeAsync(config, audio);

        while (!response.isDone()) {
          System.out.println("Waiting for response...");
          Thread.sleep(10000);
        }
        // Just print the first result here.
        SpeechRecognitionResult result = response.get().getResultsList().get(0);

        // There can be several alternative transcripts for a given chunk of speech. Just use the
        // first (most likely) one here.
        SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
        // Print out the result
        System.out.printf("Transcript : %s\n", alternative.getTranscript());
        System.out.format(
            "First Word and Confidence : %s %s \n",
            alternative.getWords(0).getWord(), alternative.getWords(0).getConfidence());
      }
    }

### Node.js

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Node.js API reference documentation](/nodejs/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    // Imports the Google Cloud client library
    const speech = require('@google-cloud/speech').v1p1beta1;

    // Creates a client
    const client = new speech.SpeechClient();

    /**
     * TODO(developer): Uncomment the following line before running the sample.
     */
    // const gcsUri = 'gs://bucket/audio.wav'; // path to GCS audio file

    const config = {
      encoding: 'FLAC',
      sampleRateHertz: 16000,
      languageCode: 'en-US',
      enableWordConfidence: true,
    };

    const audio = {
      uri: gcsUri,
    };

    const request = {
      config: config,
      audio: audio,
    };

    // Note: `await` must run inside an async function.
    const [response] = await client.recognize(request);
    const transcription = response.results
      .map(result => result.alternatives[0].transcript)
      .join('\n');
    const confidence = response.results
      .map(result => result.alternatives[0].confidence)
      .join('\n');
    console.log(`Transcription: ${transcription} \n Confidence: ${confidence}`);

    console.log('Word-Level-Confidence:');
    const words = response.results.map(result => result.alternatives[0]);
    words[0].words.forEach(a => {
      console.log(` word: ${a.word}, confidence: ${a.confidence}`);
    });

### Python

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Python API reference documentation](/python/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.cloud import speech_v1p1beta1 as speech


    def transcribe_file_with_word_level_confidence(audio_uri: str) -> str:
        """Transcribe a remote audio file with word level confidence.
        Args:
            audio_uri (str): The Cloud Storage URI of the input audio.
                E.g., gs://[BUCKET]/[FILE]
        Returns:
            The generated transcript from the audio file provided with word level confidence.
        """

        client = speech.SpeechClient()

        # Configure request to enable word level confidence
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
            sample_rate_hertz=44100,
            language_code="en-US",
            enable_word_confidence=True,  # Enable word level confidence
        )

        # Set the remote path for the audio file
        audio = speech.RecognitionAudio(uri=audio_uri)

        # Use non-blocking call for getting file transcription
        response = client.long_running_recognize(config=config, audio=audio).result(
            timeout=300
        )

        transcript_builder = []
        for i, result in enumerate(response.results):
            alternative = result.alternatives[0]
            transcript_builder.append("-" * 20)
            transcript_builder.append(f"\nFirst alternative of result {i}")
            transcript_builder.append(f"\nTranscript: {alternative.transcript}")
            transcript_builder.append(
                "\nFirst Word and Confidence: ({}, {})".format(
                    alternative.words[0].word, alternative.words[0].confidence
                )
            )

        transcript = "".join(transcript_builder)
        print(transcript)

        return transcript

What's next
-----------

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=speech).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.