Transcribe a local multi-channel file
Transcribe a local audio file that includes more than one channel.
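Before sending a request, it can help to confirm how many channels the local file actually contains, because the samples on this page hard-code two channels at 44100 Hz. The following is a minimal sketch, not part of the official sample, that reads those values from a WAV header with Python's standard `wave` module. The helper name `describe_wav` and the dictionary keys are hypothetical, chosen to mirror the config field names used in the samples.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read channel count, sample rate, and sample width from a WAV header.

    Hypothetical helper: the returned keys mirror the RecognitionConfig
    field names used in the samples, but this function is not part of
    the Speech-to-Text client library.
    """
    with wave.open(path, "rb") as wav_file:
        return {
            "audio_channel_count": wav_file.getnchannels(),
            "sample_rate_hertz": wav_file.getframerate(),
            "sample_width_bytes": wav_file.getsampwidth(),
        }
```

A sample width of 2 bytes corresponds to 16-bit PCM, which is what the `LINEAR16` encoding expects. This check only works for files that carry a WAV header; headerless raw audio has no metadata to read, so its channel count and sample rate must be known in advance.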
Explore further
For detailed documentation that includes this code sample, see the following:

- [Transcribe audio with multiple channels](/speech-to-text/docs/multi-channel)
Code sample
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Java

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Java API reference documentation](/java/docs/reference/google-cloud-speech/latest/overview).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    import com.google.cloud.speech.v1.RecognitionAudio;
    import com.google.cloud.speech.v1.RecognitionConfig;
    import com.google.cloud.speech.v1.RecognitionConfig.AudioEncoding;
    import com.google.cloud.speech.v1.RecognizeResponse;
    import com.google.cloud.speech.v1.SpeechClient;
    import com.google.cloud.speech.v1.SpeechRecognitionAlternative;
    import com.google.cloud.speech.v1.SpeechRecognitionResult;
    import com.google.protobuf.ByteString;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    /**
     * Transcribe a local audio file with multi-channel recognition
     *
     * @param fileName the path to local audio file
     */
    public static void transcribeMultiChannel(String fileName) throws Exception {
      Path path = Paths.get(fileName);
      byte[] content = Files.readAllBytes(path);

      try (SpeechClient speechClient = SpeechClient.create()) {
        // Get the contents of the local audio file
        RecognitionAudio recognitionAudio =
            RecognitionAudio.newBuilder().setContent(ByteString.copyFrom(content)).build();

        // Configure request to enable multiple channels
        RecognitionConfig config =
            RecognitionConfig.newBuilder()
                .setEncoding(AudioEncoding.LINEAR16)
                .setLanguageCode("en-US")
                .setSampleRateHertz(44100)
                .setAudioChannelCount(2)
                .setEnableSeparateRecognitionPerChannel(true)
                .build();

        // Perform the transcription request
        RecognizeResponse recognizeResponse = speechClient.recognize(config, recognitionAudio);

        // Print out the results
        for (SpeechRecognitionResult result : recognizeResponse.getResultsList()) {
          // There can be several alternative transcripts for a given chunk of speech. Just use the
          // first (most likely) one here.
          SpeechRecognitionAlternative alternative = result.getAlternatives(0);
          System.out.printf("Transcript : %s\n", alternative.getTranscript());
          System.out.printf("Channel Tag : %s\n", result.getChannelTag());
        }
      }
    }

Node.js

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Node.js API reference documentation](/nodejs/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    const fs = require('fs');

    // Imports the Google Cloud client library
    const speech = require('@google-cloud/speech').v1;

    // Creates a client
    const client = new speech.SpeechClient();

    /**
     * TODO(developer): Uncomment the following lines before running the sample.
     */
    // const fileName = 'Local path to audio file, e.g. /path/to/audio.raw';

    const config = {
      encoding: 'LINEAR16',
      languageCode: 'en-US',
      audioChannelCount: 2,
      enableSeparateRecognitionPerChannel: true,
    };

    const audio = {
      content: fs.readFileSync(fileName).toString('base64'),
    };

    const request = {
      config: config,
      audio: audio,
    };

    // `await` needs an enclosing async function in a CommonJS module
    async function transcribeMultiChannel() {
      const [response] = await client.recognize(request);
      const transcription = response.results
        .map(
          result =>
            ` Channel Tag: ${result.channelTag} ${result.alternatives[0].transcript}`
        )
        .join('\n');
      console.log(`Transcription: \n${transcription}`);
    }

    transcribeMultiChannel().catch(console.error);

Python

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Python API reference documentation](/python/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.cloud import speech


    def transcribe_file_with_multichannel(audio_file: str) -> speech.RecognizeResponse:
        """Transcribe the given audio file synchronously with multi-channel recognition.
        Args:
            audio_file (str): Path to the local audio file to be transcribed.
                Example: "resources/multi.wav"
        Returns:
            speech.RecognizeResponse: The full response object which includes the transcription results.
        """
        client = speech.SpeechClient()

        with open(audio_file, "rb") as f:
            audio_content = f.read()

        audio = speech.RecognitionAudio(content=audio_content)

        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=44100,
            language_code="en-US",
            audio_channel_count=2,
            enable_separate_recognition_per_channel=True,
        )

        response = client.recognize(config=config, audio=audio)

        for i, result in enumerate(response.results):
            alternative = result.alternatives[0]
            print("-" * 20)
            print(f"First alternative of result {i}")
            print(f"Transcript: {alternative.transcript}")
            print(f"Channel Tag: {result.channel_tag}")

        # Return the full response, not just the last result from the loop
        return response

Ruby

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    # audio_file_path = "path/to/audio.wav"

    require "google/cloud/speech"

    speech = Google::Cloud::Speech.speech version: :v1

    config = {
      encoding:                                :LINEAR16,
      sample_rate_hertz:                       44_100,
      language_code:                           "en-US",
      audio_channel_count:                     2,
      enable_separate_recognition_per_channel: true
    }

    audio_file = File.binread audio_file_path
    audio      = { content: audio_file }

    response = speech.recognize config: config, audio: audio

    results = response.results

    results.each_with_index do |result, i|
      alternative = result.alternatives.first
      puts "-" * 20
      puts "First alternative of result #{i}"
      puts "Transcript: #{alternative.transcript}"
      puts "Channel Tag: #{result.channel_tag}"
    end

What's next

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=speech).
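The samples above print each result together with its channel tag, so output from different channels is interleaved. As a possible follow-up, the sketch below groups the top transcript of each result under its channel tag. The function name `transcripts_by_channel` is hypothetical and not part of the client library. It assumes only the `channel_tag`, `alternatives`, and `transcript` attributes that the Python sample already reads.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def transcripts_by_channel(results: Iterable[Any]) -> Dict[int, List[str]]:
    """Group the first (most likely) transcript of each result by channel tag.

    Hypothetical helper: `results` is expected to be something like
    `response.results` from the Python sample, where each item exposes
    `channel_tag` and a sequence of `alternatives` with a `transcript`.
    """
    grouped: Dict[int, List[str]] = defaultdict(list)
    for result in results:
        if not result.alternatives:
            continue  # Skip results that carry no alternatives
        grouped[result.channel_tag].append(result.alternatives[0].transcript)
    return dict(grouped)
```

With the Python sample, this could be called as `transcripts_by_channel(response.results)` on the value returned by `client.recognize`. Keeping the helper independent of the client library means it can be exercised with plain stand-in objects in tests.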