This page describes how to convert audio from a binary file to base64-encoded data.
When passing audio to the Speech API, you can either pass the URI of a file
located on Google Cloud Storage, or you can embed audio data directly within
the request's content field.
Embedding base64 encoded audio
Audio data is binary data. Within a gRPC request, you can simply write the binary data out directly; however, JSON is used when making a REST request. JSON is a text format that does not directly support binary data, so you will need to convert such binary data into text using Base64 encoding.
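The following is a minimal sketch of that point, not part of the Speech API itself. It assumes a short, hypothetical byte string (raw_audio) in place of a real audio file, and shows that raw bytes are rejected by a JSON encoder while their Base64 form passes through JSON and decodes back to the same bytes.

```python
import base64
import json

# Hypothetical sample: the first bytes of a FLAC stream, standing in for a real audio file.
raw_audio = b"fLaC\x00\x00\x00\x22\x10\x00"

# json.dumps rejects raw bytes, which is why the data must be converted to text first.
try:
    json.dumps({"audio": {"content": raw_audio}})
    bytes_accepted = True
except TypeError:
    bytes_accepted = False

# Base64 maps every 3 bytes to 4 ASCII characters, so the result is JSON-safe text.
encoded = base64.b64encode(raw_audio).decode("ascii")
body = json.dumps({"audio": {"content": encoded}})

# Decoding the text on the receiving side recovers the original bytes exactly.
decoded = base64.b64decode(json.loads(body)["audio"]["content"])
```

Here encoded is "ZkxhQwAAACIQAA==". The leading "ZkxhQw" is the Base64 form of the "fLaC" marker that begins a FLAC stream, which is why the sample requests on this page start with the same characters.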
To base64 encode an audio file:
Linux
1. Encode the audio file using the base64 command line tool, making sure to
prevent line-wrapping by using the -w 0 flag:
$ base64 source_audio_file -w 0 > dest_audio_file
2. Create a JSON request file, inlining the base64-encoded audio within the
request's content field:
{ "config": { "encoding":"FLAC", "sampleRateHertz":16000, "languageCode":"en-US" }, "audio": { "content": "ZkxhQwAAACIQABAAAAUJABtAA+gA8AB+W8FZndQvQAyjv..." } }
Mac OS X
1. Encode the audio file using the base64 command line tool:
$ base64 source_audio_file > dest_audio_file
2. Create a JSON request file, inlining the base64-encoded audio within the
request's content field:
{ "config": { "encoding":"FLAC", "sampleRateHertz":16000, "languageCode":"en-US" }, "audio": { "content": "ZkxhQwAAACIQABAAAAUJABtAA+gA8AB+W8FZndQvQAyjv..." } }
Windows
1. Encode the audio file using the Base64.exe tool:
C:> Base64.exe -e source_audio_file > dest_audio_file
2. Create a JSON request file, inlining the base64-encoded audio within the
request's content field:
{ "config": { "encoding":"FLAC", "sampleRateHertz":16000, "languageCode":"en-US" }, "audio": { "content": "ZkxhQwAAACIQABAAAAUJABtAA+gA8AB+W8FZndQvQAyjv..." } }
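Step 2 of the procedures above can also be scripted rather than done by hand. The sketch below is one possible approach, not an official tool: the function names build_request and write_request_file and the request_path argument are hypothetical, and the config values are simply copied from the sample request. It removes any line breaks the encoding tool may have inserted before inlining the text.

```python
import json
from pathlib import Path


def build_request(encoded_text: str) -> dict:
    """Build the request body from Base64 text (hypothetical helper)."""
    # Base64 text contains no whitespace of its own, so any line breaks or
    # spaces added by the encoding tool can be dropped safely.
    content = "".join(encoded_text.split())
    return {
        # Config values copied from the sample request; adjust for your audio.
        "config": {
            "encoding": "FLAC",
            "sampleRateHertz": 16000,
            "languageCode": "en-US",
        },
        "audio": {"content": content},
    }


def write_request_file(encoded_path: str, request_path: str) -> None:
    """Read the encoded file from step 1 and write a JSON request file."""
    encoded_text = Path(encoded_path).read_text(encoding="ascii")
    request = build_request(encoded_text)
    Path(request_path).write_text(json.dumps(request, indent=2), encoding="utf-8")
```

For example, write_request_file("dest_audio_file", "request.json") would produce a request file from the output of step 1, where request.json is an assumed file name.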
Embedding audio content programmatically
Embedding audio binary data into requests through text editors is neither desirable nor practical. In practice, you will be embedding base64-encoded files within client code. All supported programming languages have built-in mechanisms for base64-encoding content:
Python
In Python, base64 encode audio files as follows:
# Import the base64 encoding library.
import base64
# Pass the audio data to an encoding function.
def encode_audio(audio):
    audio_content = audio.read()
    return base64.b64encode(audio_content)
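A short usage sketch for encode_audio follows, under two assumptions: the file object is opened in binary mode, and the result is destined for a JSON request. Because base64.b64encode returns bytes, the value is decoded to a str before it is placed in the content field. The helper name encode_audio_text is hypothetical, and io.BytesIO stands in for a real file.

```python
import base64
import io


def encode_audio(audio):
    """Read a binary file object and Base64 encode it (same helper as above)."""
    audio_content = audio.read()
    return base64.b64encode(audio_content)


def encode_audio_text(audio) -> str:
    """Hypothetical wrapper: return the encoding as str, ready for JSON."""
    return encode_audio(audio).decode("ascii")


# io.BytesIO stands in for open("source_audio_file", "rb").
sample = io.BytesIO(b"fLaC")
encoded_bytes = encode_audio(sample)

# Rewind the stand-in file before reading it a second time.
sample.seek(0)
encoded_text = encode_audio_text(sample)
```

With a real file, the call would look like encode_audio_text(open("source_audio_file", "rb")), ideally inside a with block so the file is closed afterwards.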
Node.js
In Node.js, base64 encode audio files by calling toString('base64') on a Buffer, where audioFile is the binary-encoded audio data; for example, Buffer.from(audioFile).toString('base64').
Java
In Java, use the encodeBase64 static method within
org.apache.commons.codec.binary.Base64 to base64 encode binary files:
// Import the Base64 encoding library.
import org.apache.commons.codec.binary.Base64;
// Encode the speech.
byte[] encodedAudio = Base64.encodeBase64(audio.getBytes());