# Transcribe a file in Cloud Storage with word-level confidence (beta)
Transcribe an audio file stored in Cloud Storage, returning the confidence level for each word.
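
Each word in a transcript carries its own `word` and `confidence` value, so a caller can flag uncertain words for review. The following is a minimal sketch of that idea, not part of the official samples: the `Word` class is a hypothetical stand-in for the word objects returned by the API, and the `0.8` threshold and both helper names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Word:
    """Hypothetical stand-in for one word entry of a recognition alternative."""

    word: str
    confidence: float


def low_confidence_words(words: Iterable[Word], threshold: float = 0.8) -> List[Word]:
    """Return the words whose confidence is below `threshold`, in original order.

    The default threshold is an arbitrary illustrative value.
    """
    return [w for w in words if w.confidence < threshold]


def mean_confidence(words: Iterable[Word]) -> float:
    """Return the average per-word confidence, or 0.0 for an empty sequence."""
    items = list(words)
    if not items:
        return 0.0
    return sum(w.confidence for w in items) / len(items)


if __name__ == "__main__":
    # Made-up values for illustration only; real values come from the API response.
    sample = [Word("how", 0.98), Word("old", 0.95), Word("is", 0.62), Word("the", 0.91)]
    for w in low_confidence_words(sample):
        print(f"review: {w.word} ({w.confidence:.2f})")
```

With a real response, the same helpers could be given the word list of the first alternative of a result, since those objects expose `word` and `confidence` attributes of the same names.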
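
Explore further
---------------

For detailed documentation that includes this code sample, see the following:

- [Enable word-level confidence](/speech-to-text/docs/word-confidence)

Code sample
-----------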

### Java

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Java API reference documentation](/java/docs/reference/google-cloud-speech/latest/overview).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    // Imports assumed for this snippet (v1p1beta1 package, matching the Node.js and Python samples)
    import com.google.api.gax.longrunning.OperationFuture;
    import com.google.cloud.speech.v1p1beta1.LongRunningRecognizeMetadata;
    import com.google.cloud.speech.v1p1beta1.LongRunningRecognizeResponse;
    import com.google.cloud.speech.v1p1beta1.RecognitionAudio;
    import com.google.cloud.speech.v1p1beta1.RecognitionConfig;
    import com.google.cloud.speech.v1p1beta1.RecognitionConfig.AudioEncoding;
    import com.google.cloud.speech.v1p1beta1.SpeechClient;
    import com.google.cloud.speech.v1p1beta1.SpeechRecognitionAlternative;
    import com.google.cloud.speech.v1p1beta1.SpeechRecognitionResult;

    /**
     * Transcribe a remote audio file with word level confidence
     *
     * @param gcsUri path to the remote audio file
     */
    public static void transcribeWordLevelConfidenceGcs(String gcsUri) throws Exception {
      try (SpeechClient speechClient = SpeechClient.create()) {

        // Configure request to enable word level confidence
        RecognitionConfig config =
            RecognitionConfig.newBuilder()
                .setEncoding(AudioEncoding.FLAC)
                .setSampleRateHertz(44100)
                .setLanguageCode("en-US")
                .setEnableWordConfidence(true)
                .build();

        // Set the remote path for the audio file
        RecognitionAudio audio = RecognitionAudio.newBuilder().setUri(gcsUri).build();

        // Use non-blocking call for getting file transcription
        OperationFuture<LongRunningRecognizeResponse, LongRunningRecognizeMetadata> response =
            speechClient.longRunningRecognizeAsync(config, audio);

        // Poll every 10 seconds until the operation completes
        while (!response.isDone()) {
          System.out.println("Waiting for response...");
          Thread.sleep(10000);
        }
        // Just print the first result here.
        SpeechRecognitionResult result = response.get().getResultsList().get(0);

        // There can be several alternative transcripts for a given chunk of speech. Just use the
        // first (most likely) one here.
        SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
        // Print out the result
        System.out.printf("Transcript : %s\n", alternative.getTranscript());
        System.out.format(
            "First Word and Confidence : %s %s \n",
            alternative.getWords(0).getWord(), alternative.getWords(0).getConfidence());
      }
    }

### Node.js

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Node.js API reference documentation](/nodejs/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    // Imports the Google Cloud client library
    const speech = require('@google-cloud/speech').v1p1beta1;

    // Creates a client
    const client = new speech.SpeechClient();

    /**
     * TODO(developer): Uncomment the following line before running the sample.
     */
    // const gcsUri = 'gs://bucket/audio.wav'; // path to the GCS audio file

    const config = {
      encoding: 'FLAC',
      sampleRateHertz: 16000,
      languageCode: 'en-US',
      enableWordConfidence: true,
    };

    const audio = {
      uri: gcsUri,
    };

    const request = {
      config: config,
      audio: audio,
    };

    // Note: `await` must run inside an async function (or an ES module).
    const [response] = await client.recognize(request);
    const transcription = response.results
      .map(result => result.alternatives[0].transcript)
      .join('\n');
    const confidence = response.results
      .map(result => result.alternatives[0].confidence)
      .join('\n');
    console.log(`Transcription: ${transcription} \n Confidence: ${confidence}`);

    // Print each word of the first result with its confidence
    console.log('Word-Level-Confidence:');
    const words = response.results.map(result => result.alternatives[0]);
    words[0].words.forEach(a => {
      console.log(` word: ${a.word}, confidence: ${a.confidence}`);
    });

### Python

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Python API reference documentation](/python/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.cloud import speech_v1p1beta1 as speech


    def transcribe_file_with_word_level_confidence(audio_uri: str) -> str:
        """Transcribe a remote audio file with word level confidence.
        Args:
            audio_uri (str): The Cloud Storage URI of the input audio.
                E.g., gs://[BUCKET]/[FILE]
        Returns:
            The generated transcript from the audio file provided with word level confidence.
        """

        client = speech.SpeechClient()

        # Configure request to enable word level confidence
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
            sample_rate_hertz=44100,
            language_code="en-US",
            enable_word_confidence=True,  # Enable word level confidence
        )

        # Set the remote path for the audio file
        audio = speech.RecognitionAudio(uri=audio_uri)

        # Start a long-running recognition operation and wait up to 300 seconds for its result
        response = client.long_running_recognize(config=config, audio=audio).result(
            timeout=300
        )

        transcript_builder = []
        for i, result in enumerate(response.results):
            alternative = result.alternatives[0]
            transcript_builder.append("-" * 20)
            transcript_builder.append(f"\nFirst alternative of result {i}")
            transcript_builder.append(f"\nTranscript: {alternative.transcript}")
            transcript_builder.append(
                "\nFirst Word and Confidence: ({}, {})".format(
                    alternative.words[0].word, alternative.words[0].confidence
                )
            )

        transcript = "".join(transcript_builder)
        print(transcript)

        return transcript

What's next
-----------

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=speech).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.