Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
# Transcribe a local multi-channel file (beta)

Transcribe a local audio file that includes more than one channel.

Explore further
---------------

For detailed documentation that includes this code sample, see the following:

- [Transcribe audio with multiple channels](/speech-to-text/docs/multi-channel)

Code sample
-----------

### Go

To learn how to install and use the client library for Speech-to-Text, see
[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the
[Speech-to-Text Go API reference documentation](/go/docs/reference/cloud.google.com/go/speech/latest/apiv1).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    import (
    	"context"
    	"fmt"
    	"io"
    	"os"

    	speech "cloud.google.com/go/speech/apiv1"
    	"cloud.google.com/go/speech/apiv1/speechpb"
    )

    // transcribeMultichannel generates a transcript from a multichannel speech file
    // and tags the speech from each channel.
    func transcribeMultichannel(w io.Writer) error {
    	ctx := context.Background()

    	client, err := speech.NewClient(ctx)
    	if err != nil {
    		return fmt.Errorf("NewClient: %w", err)
    	}
    	defer client.Close()

    	data, err := os.ReadFile("../testdata/commercial_stereo.wav")
    	if err != nil {
    		return fmt.Errorf("ReadFile: %w", err)
    	}

    	resp, err := client.Recognize(ctx, &speechpb.RecognizeRequest{
    		Config: &speechpb.RecognitionConfig{
    			Encoding:                            speechpb.RecognitionConfig_LINEAR16,
    			SampleRateHertz:                     44100,
    			LanguageCode:                        "en-US",
    			AudioChannelCount:                   2,
    			EnableSeparateRecognitionPerChannel: true,
    		},
    		Audio: &speechpb.RecognitionAudio{
    			AudioSource: &speechpb.RecognitionAudio_Content{Content: data},
    		},
    	})
    	if err != nil {
    		return fmt.Errorf("Recognize: %w", err)
    	}

    	// Print each alternative's transcript, tagged with its channel.
    	for _, result := range resp.Results {
    		for _, alt := range result.Alternatives {
    			fmt.Fprintf(w, "Channel %v: %v\n", result.ChannelTag, alt.Transcript)
    		}
    	}
    	return nil
    }

### Java

To learn how to install and use the client library for Speech-to-Text, see
[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the
[Speech-to-Text Java API reference documentation](/java/docs/reference/google-cloud-speech/latest/overview).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    import com.google.cloud.speech.v1.RecognitionAudio;
    import com.google.cloud.speech.v1.RecognitionConfig;
    import com.google.cloud.speech.v1.RecognitionConfig.AudioEncoding;
    import com.google.cloud.speech.v1.RecognizeResponse;
    import com.google.cloud.speech.v1.SpeechClient;
    import com.google.cloud.speech.v1.SpeechRecognitionAlternative;
    import com.google.cloud.speech.v1.SpeechRecognitionResult;
    import com.google.protobuf.ByteString;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    /**
     * Transcribe a local audio file with multi-channel recognition.
     *
     * @param fileName the path to the local audio file
     */
    public static void transcribeMultiChannel(String fileName) throws Exception {
      Path path = Paths.get(fileName);
      byte[] content = Files.readAllBytes(path);

      try (SpeechClient speechClient = SpeechClient.create()) {
        // Get the contents of the local audio file
        RecognitionAudio recognitionAudio =
            RecognitionAudio.newBuilder().setContent(ByteString.copyFrom(content)).build();

        // Configure the request to enable multiple channels
        RecognitionConfig config =
            RecognitionConfig.newBuilder()
                .setEncoding(AudioEncoding.LINEAR16)
                .setLanguageCode("en-US")
                .setSampleRateHertz(44100)
                .setAudioChannelCount(2)
                .setEnableSeparateRecognitionPerChannel(true)
                .build();

        // Perform the transcription request
        RecognizeResponse recognizeResponse = speechClient.recognize(config, recognitionAudio);

        // Print out the results
        for (SpeechRecognitionResult result : recognizeResponse.getResultsList()) {
          // There can be several alternative transcripts for a given chunk of speech.
          // Just use the first (most likely) one here.
          SpeechRecognitionAlternative alternative = result.getAlternatives(0);
          System.out.format("Transcript : %s\n", alternative.getTranscript());
          System.out.printf("Channel Tag : %s\n\n", result.getChannelTag());
        }
      }
    }

### Node.js

To learn how to install and use the client library for Speech-to-Text, see
[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the
[Speech-to-Text Node.js API reference documentation](/nodejs/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    const fs = require('fs');

    // Imports the Google Cloud client library
    const speech = require('@google-cloud/speech').v1p1beta1;

    // Creates a client
    const client = new speech.SpeechClient();

    async function transcribeMultiChannel() {
      /**
       * TODO(developer): Uncomment the following line before running the sample.
       */
      // const fileName = 'Local path to audio file, e.g. /path/to/audio.raw';

      const config = {
        encoding: 'LINEAR16',
        languageCode: 'en-US',
        audioChannelCount: 2,
        enableSeparateRecognitionPerChannel: true,
      };

      const audio = {
        content: fs.readFileSync(fileName).toString('base64'),
      };

      const request = {
        config: config,
        audio: audio,
      };

      const [response] = await client.recognize(request);
      const transcription = response.results
        .map(
          result =>
            ` Channel Tag: ${result.channelTag} ${result.alternatives[0].transcript}`
        )
        .join('\n');
      console.log(`Transcription: \n${transcription}`);
    }

    transcribeMultiChannel();

### Python

To learn how to install and use the client library for Speech-to-Text, see
[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the
[Speech-to-Text Python API reference documentation](/python/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.cloud import speech_v1p1beta1 as speech


    def transcribe_multichannel():
        """Transcribe a local audio file with separate recognition per channel."""
        client = speech.SpeechClient()

        speech_file = "resources/Google_Gnome.wav"

        with open(speech_file, "rb") as audio_file:
            content = audio_file.read()

        audio = speech.RecognitionAudio(content=content)

        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=16000,
            language_code="en-US",
            audio_channel_count=1,
            enable_separate_recognition_per_channel=True,
        )

        response = client.recognize(config=config, audio=audio)

        for i, result in enumerate(response.results):
            alternative = result.alternatives[0]
            print("-" * 20)
            print(f"First alternative of result {i}")
            print(f"Transcript: {alternative.transcript}")
            print(f"Channel Tag: {result.channel_tag}")

        return response.results

What's next
-----------

To search and filter code samples for other Google Cloud products, see the
[Google Cloud sample browser](/docs/samples?product=speech).
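When `enable_separate_recognition_per_channel` is set, each result in the response carries a `channel_tag`, and results for different channels arrive interleaved. If you want one combined transcript per channel rather than the per-result printout above, you can group by that tag. The sketch below uses small stand-in classes (`Result`, `Alternative`) in place of the real API response types, purely for illustration; in practice `results` would come from `client.recognize(...)` as in the Python sample.

```python
from collections import defaultdict
from dataclasses import dataclass, field


# Stand-ins for the API response types, for illustration only.
@dataclass
class Alternative:
    transcript: str


@dataclass
class Result:
    channel_tag: int
    alternatives: list = field(default_factory=list)


def group_by_channel(results):
    """Collect the top (first) transcript of each result, keyed by channel tag."""
    channels = defaultdict(list)
    for result in results:
        if result.alternatives:
            channels[result.channel_tag].append(result.alternatives[0].transcript)
    return dict(channels)


# Hypothetical interleaved results, as a two-channel call might produce.
results = [
    Result(1, [Alternative("hi, I'd like to buy a Chromecast")]),
    Result(2, [Alternative("certainly, which color would you like")]),
    Result(1, [Alternative("the blue one please")]),
]

for tag, transcripts in group_by_channel(results).items():
    print(f"Channel {tag}: {' '.join(transcripts)}")
```

The same grouping works unchanged on real `response.results`, since the stand-in classes mirror the `channel_tag` and `alternatives[0].transcript` fields the samples above already read.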