La API de Media Translation está obsoleta y ya no estará disponible en Google Cloud después del 1 de julio de 2024. Puedes replicar la funcionalidad de la API de Media Translation mediante una combinación de otros servicios de Google Cloud, como Cloud Speech-to-Text y la API de Cloud Translation.

Traduce audio de transmisión a texto

La traducción de medios traduce un archivo de audio o una transmisión de voz a texto de otro idioma. En esta página, se proporcionan muestras de código en las que se demuestra cómo traducir audio de transmisión a texto mediante las bibliotecas cliente de la traducción de medios.

Configura tu proyecto

Antes de poder usar la traducción de medios, debes configurar un proyecto de Google Cloud y habilitar la API de traducción de medios para ese proyecto.

Accede a tu cuenta de Google Cloud. Si eres nuevo en Google Cloud, crea una cuenta para evaluar el rendimiento de nuestros productos en situaciones reales. Los clientes nuevos también obtienen $300 en créditos gratuitos para ejecutar, probar y, además, implementar cargas de trabajo.

En la página del selector de proyectos de la consola de Google Cloud, selecciona o crea un proyecto de Google Cloud.

Ir al selector de proyectos

Asegúrate de que la facturación esté habilitada para tu proyecto de Google Cloud.

Habilita la API de Media Translation.

Habilita la API

Crear una cuenta de servicio:

En la consola de Google Cloud, ve a la página Crear cuenta de servicio.
Ve a Crear cuenta de servicio
Elige tu proyecto.
Ingresa un nombre en el campo Nombre de cuenta de servicio. La consola de Google Cloud completa el campo ID de cuenta de servicio en función de este nombre.

Opcional: en el campo Descripción de la cuenta de servicio, ingresa una descripción. Por ejemplo, Service account for quickstart.
Haz clic en Crear y continuar.
Otorga el rol Project > Owner a la cuenta de servicio.

Para otorgar el rol, busca la lista Seleccionar un rol y, luego, selecciona Project > Owner.

Nota: El campo Rol afecta a qué recursos puede acceder tu cuenta de servicio en el proyecto. Puedes revocar estos roles o asignar otros más adelante. En entornos de producción, no otorgues los roles de propietario, editor o visualizador. En su lugar, otorga un rol predefinido o un rol personalizado que satisfaga tus necesidades.
Haga clic en Continuar.
Haz clic en Listo para terminar de crear la cuenta de servicio.

No cierres la ventana del navegador. La usarás en la próxima tarea.

Haz lo siguiente para crear una clave de cuenta de servicio:

En la consola de Google Cloud, haz clic en la dirección de correo electrónico de la cuenta de servicio que creaste.
Haga clic en Claves.
Haz clic en Agregar clave y, luego, en Crear clave nueva.
Haga clic en Crear. Se descargará un archivo de claves JSON en tu computadora.
Haga clic en Cerrar.

Configura la variable de entorno GOOGLE_APPLICATION_CREDENTIALS en la ruta del archivo JSON que contiene tus credenciales. Esta variable solo se aplica a la sesión actual de Cloud Shell. Por lo tanto, si abres una sesión nueva, deberás volver a configurar la variable.

Ejemplo: Linux o macOS

export GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"

Reemplaza KEY_PATH por la ruta del archivo JSON que contiene tus credenciales.

Por ejemplo:

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"

Ejemplo: Windows

Para PowerShell:

$env:GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"

Reemplaza KEY_PATH por la ruta del archivo JSON que contiene tus credenciales.

Por ejemplo:

$env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\service-account-file.json"

Para el símbolo del sistema:

set GOOGLE_APPLICATION_CREDENTIALS=KEY_PATH

Reemplaza KEY_PATH por la ruta del archivo JSON que contiene tus credenciales.

Instala Google Cloud CLI.

Para inicializar la CLI de gcloud, ejecuta el siguiente comando:

gcloud init

En la página del selector de proyectos de la consola de Google Cloud, selecciona o crea un proyecto de Google Cloud.

Ir al selector de proyectos

Asegúrate de que la facturación esté habilitada para tu proyecto de Google Cloud.

Habilita la API de Media Translation.

Habilita la API

Crear una cuenta de servicio:

En la consola de Google Cloud, ve a la página Crear cuenta de servicio.
Ve a Crear cuenta de servicio
Elige tu proyecto.
Ingresa un nombre en el campo Nombre de cuenta de servicio. La consola de Google Cloud completa el campo ID de cuenta de servicio en función de este nombre.

Opcional: en el campo Descripción de la cuenta de servicio, ingresa una descripción. Por ejemplo, Service account for quickstart.
Haz clic en Crear y continuar.
Otorga el rol Project > Owner a la cuenta de servicio.

Para otorgar el rol, busca la lista Seleccionar un rol y, luego, selecciona Project > Owner.

Nota: El campo Rol afecta a qué recursos puede acceder tu cuenta de servicio en el proyecto. Puedes revocar estos roles o asignar otros más adelante. En entornos de producción, no otorgues los roles de propietario, editor o visualizador. En su lugar, otorga un rol predefinido o un rol personalizado que satisfaga tus necesidades.
Haga clic en Continuar.
Haz clic en Listo para terminar de crear la cuenta de servicio.

No cierres la ventana del navegador. La usarás en la próxima tarea.

Haz lo siguiente para crear una clave de cuenta de servicio:

En la consola de Google Cloud, haz clic en la dirección de correo electrónico de la cuenta de servicio que creaste.
Haga clic en Claves.
Haz clic en Agregar clave y, luego, en Crear clave nueva.
Haga clic en Crear. Se descargará un archivo de claves JSON en tu computadora.
Haga clic en Cerrar.

Ejemplo: Linux o macOS

export GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"

Reemplaza KEY_PATH por la ruta del archivo JSON que contiene tus credenciales.

Por ejemplo:

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"

Ejemplo: Windows

Para PowerShell:

$env:GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"

Reemplaza KEY_PATH por la ruta del archivo JSON que contiene tus credenciales.

Por ejemplo:

$env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\service-account-file.json"

Para el símbolo del sistema:

set GOOGLE_APPLICATION_CREDENTIALS=KEY_PATH

Reemplaza KEY_PATH por la ruta del archivo JSON que contiene tus credenciales.

Instala Google Cloud CLI.

Para inicializar la CLI de gcloud, ejecuta el siguiente comando:

gcloud init

Instala la biblioteca cliente para tu lenguaje de preferencia.

Traduce la voz

En las siguientes muestras de código, se demuestra cómo traducir la voz de un archivo que contiene hasta cinco minutos de audio o de un micrófono en vivo. Consulta las prácticas recomendadas para obtener recomendaciones sobre cómo proporcionar datos de voz a fin de obtener la mayor exactitud en el reconocimiento.

Los pasos principales son los mismos independientemente de la fuente de audio:

Inicializa un cliente SpeechTranslationServiceClient a fin de usarlo para enviar solicitudes a la traducción de medios.

Puedes volver a usar el mismo cliente para varias solicitudes.
Crea un objeto de solicitud StreamingTranslateSpeechConfig que especifique cómo procesar el audio.

El objeto StreamingTranslateSpeechConfig consta de un objeto TranslateSpeechConfig que proporciona información sobre el archivo de origen de audio y una propiedad single_utterance que especifica si la traducción de medios continúa traduciendo cuando el emisor se detiene.

El objeto TranslateSpeechConfig proporciona especificaciones técnicas de la fuente de audio (como la codificación y la tasa de muestreo), establece los idiomas de origen y de destino de la traducción (mediante sus códigos de idioma BCP-47) y define qué modelo de traducción se usa en la traducción de medios para la transcripción.

Nota: Especificar un modelo de traducción es opcional. En la traducción de medios, se usa un modelo básico predeterminado para la mayoría de las solicitudes. Para algunos pares de idiomas, hay modelos mejorados disponibles que están optimizados para las llamadas telefónicas o los videos. Consulta la página Compatibilidad de idiomas para conocer los modelos disponibles.
Envía una secuencia de objetos de solicitud StreamingTranslateSpeechRequest.

Debes enviar una secuencia de solicitudes para cada archivo de audio que deseas traducir. La primera solicitud proporciona el objeto StreamingTranslateSpeechConfig para la solicitud y las siguientes solicitudes proporcionan el contenido de audio en la transmisión.
Recibe el objeto de respuesta StreamingTranslateSpeechResult.

Mientras se recibe cualquier respuesta con un valor text_translation_result.is_final de false, el último resultado traducido reemplaza el resultado anterior.

Cuando la traducción de medios tiene un resultado final, el campo text_translation_result.is_final se establece en true y cualquier resultado de traducción que se reciba después se agrega al resultado anterior. (En esta instancia, el resultado anterior no se reemplaza). Puedes generar la traducción completa y comenzar con una sección nueva para la siguiente parte de la transcripción y el audio correspondiente.

Cuando se detenga el interlocutor, si el campo single_utterance se configura como verdadero en el objeto de solicitud StreamingTranslateSpeechConfig, la traducción de medios mostrará un evento END_OF_SINGLE_UTTERANCE para speech_event_type en la respuesta. El cliente dejará de enviar solicitudes, pero seguirá recibiendo respuestas hasta que finalice la traducción.
La transmisión tiene un límite de 5 minutos. Si superas este límite, se mostrará el error OUT_OF_RANGE.

Muestras de código

Traduce la voz de un archivo de audio

Java

Para obtener información sobre cómo instalar y usar la biblioteca cliente de Media Translation, consulta las bibliotecas cliente de Media Translation. Si deseas obtener más información, consulta la documentación de referencia de la API de Media Translation Java.

Para autenticarte en Media Translation, configura las credenciales predeterminadas de la aplicación. Si deseas obtener más información, consulta Configura la autenticación para un entorno de desarrollo local.

Ver en GitHub


import com.google.api.gax.rpc.BidiStream;
import com.google.cloud.mediatranslation.v1beta1.SpeechTranslationServiceClient;
import com.google.cloud.mediatranslation.v1beta1.StreamingTranslateSpeechConfig;
import com.google.cloud.mediatranslation.v1beta1.StreamingTranslateSpeechRequest;
import com.google.cloud.mediatranslation.v1beta1.StreamingTranslateSpeechResponse;
import com.google.cloud.mediatranslation.v1beta1.StreamingTranslateSpeechResult;
import com.google.cloud.mediatranslation.v1beta1.TranslateSpeechConfig;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TranslateFromFile {

  public static void translateFromFile() throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String filePath = "path/to/audio.raw";
    translateFromFile(filePath);
  }

  public static void translateFromFile(String filePath) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (SpeechTranslationServiceClient client = SpeechTranslationServiceClient.create()) {
      Path path = Paths.get(filePath);
      byte[] content = Files.readAllBytes(path);
      ByteString audioContent = ByteString.copyFrom(content);

      TranslateSpeechConfig audioConfig =
          TranslateSpeechConfig.newBuilder()
              .setAudioEncoding("linear16")
              .setSampleRateHertz(16000)
              .setSourceLanguageCode("en-US")
              .setTargetLanguageCode("fr-FR")
              .build();

      StreamingTranslateSpeechConfig config =
          StreamingTranslateSpeechConfig.newBuilder()
              .setAudioConfig(audioConfig)
              .setSingleUtterance(true)
              .build();

      BidiStream<StreamingTranslateSpeechRequest, StreamingTranslateSpeechResponse> bidiStream =
          client.streamingTranslateSpeechCallable().call();

      // The first request contains the configuration.
      StreamingTranslateSpeechRequest requestConfig =
          StreamingTranslateSpeechRequest.newBuilder().setStreamingConfig(config).build();

      // The second request contains the audio
      StreamingTranslateSpeechRequest request =
          StreamingTranslateSpeechRequest.newBuilder().setAudioContent(audioContent).build();

      bidiStream.send(requestConfig);
      bidiStream.send(request);

      for (StreamingTranslateSpeechResponse response : bidiStream) {
        // Once the transcription settles, the response contains the
        // is_final result. The other results will be for subsequent portions of
        // the audio.
        StreamingTranslateSpeechResult res = response.getResult();
        String translation = res.getTextTranslationResult().getTranslation();

        if (res.getTextTranslationResult().getIsFinal()) {
          System.out.println(String.format("\nFinal translation: %s", translation));
          break;
        }
        System.out.println(String.format("\nPartial translation: %s", translation));
      }
    }
  }
}

Node.js

Ver en GitHub

const fs = require('fs');

// Imports the CLoud Media Translation client library
const {
  SpeechTranslationServiceClient,
} = require('@google-cloud/media-translation');

// Creates a client
const client = new SpeechTranslationServiceClient();

async function translate_from_file() {
  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  // const filename = 'Local path to audio file, e.g. /path/to/audio.raw';
  // const encoding = 'Encoding of the audio file, e.g. LINEAR16';
  // const sourceLanguage = 'BCP-47 source language code, e.g. en-US';
  // const targetLanguage = 'BCP-47 target language code, e.g. es-ES';

  const config = {
    audioConfig: {
      audioEncoding: encoding,
      sourceLanguageCode: sourceLanguage,
      targetLanguageCode: targetLanguage,
    },
    single_utterance: true,
  };

  // First request needs to have only a streaming config, no data.
  const initialRequest = {
    streamingConfig: config,
    audioContent: null,
  };

  const readStream = fs.createReadStream(filename, {
    highWaterMark: 4096,
    encoding: 'base64',
  });

  const chunks = [];
  readStream
    .on('data', chunk => {
      const request = {
        streamingConfig: config,
        audioContent: chunk.toString(),
      };
      chunks.push(request);
    })
    .on('close', () => {
      // Config-only request should be first in stream of requests
      stream.write(initialRequest);
      for (let i = 0; i < chunks.length; i++) {
        stream.write(chunks[i]);
      }
      stream.end();
    });

  const stream = client.streamingTranslateSpeech().on('data', response => {
    const {result} = response;
    if (result.textTranslationResult.isFinal) {
      console.log(
        `\nFinal translation: ${result.textTranslationResult.translation}`
      );
      console.log(`Final recognition result: ${result.recognitionResult}`);
    } else {
      console.log(
        `\nPartial translation: ${result.textTranslationResult.translation}`
      );
      console.log(`Partial recognition result: ${result.recognitionResult}`);
    }
  });

Python

Ver en GitHub

from google.cloud import mediatranslation

def translate_from_file(file_path="path/to/your/file"):
    client = mediatranslation.SpeechTranslationServiceClient()

    # The `sample_rate_hertz` field is not required for FLAC and WAV (Linear16)
    # encoded data. Other audio encodings must provide the sampling rate.
    audio_config = mediatranslation.TranslateSpeechConfig(
        audio_encoding="linear16",
        source_language_code="en-US",
        target_language_code="fr-FR",
    )

    streaming_config = mediatranslation.StreamingTranslateSpeechConfig(
        audio_config=audio_config, single_utterance=True
    )

    def request_generator(config, audio_file_path):
        # The first request contains the configuration.
        # Note that audio_content is explicitly set to None.
        yield mediatranslation.StreamingTranslateSpeechRequest(streaming_config=config)

        with open(audio_file_path, "rb") as audio:
            while True:
                chunk = audio.read(4096)
                if not chunk:
                    break
                yield mediatranslation.StreamingTranslateSpeechRequest(
                    audio_content=chunk
                )

    requests = request_generator(streaming_config, file_path)
    responses = client.streaming_translate_speech(requests)

    for response in responses:
        # Once the transcription settles, the response contains the
        # is_final result. The other results will be for subsequent portions of
        # the audio.
        print(f"Response: {response}")
        result = response.result
        translation = result.text_translation_result.translation

        if result.text_translation_result.is_final:
            print(f"\nFinal translation: {translation}")
            break

        print(f"\nPartial translation: {translation}")

Traduce la voz desde un micrófono

Java

Ver en GitHub


import com.google.api.gax.rpc.ClientStream;
import com.google.api.gax.rpc.ResponseObserver;
import com.google.api.gax.rpc.StreamController;
import com.google.cloud.mediatranslation.v1beta1.SpeechTranslationServiceClient;
import com.google.cloud.mediatranslation.v1beta1.StreamingTranslateSpeechConfig;
import com.google.cloud.mediatranslation.v1beta1.StreamingTranslateSpeechRequest;
import com.google.cloud.mediatranslation.v1beta1.StreamingTranslateSpeechResponse;
import com.google.cloud.mediatranslation.v1beta1.StreamingTranslateSpeechResult;
import com.google.cloud.mediatranslation.v1beta1.TranslateSpeechConfig;
import com.google.protobuf.ByteString;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.TargetDataLine;

public class TranslateFromMic {

  public static void main(String[] args) throws IOException, LineUnavailableException {
    translateFromMic();
  }

  public static void translateFromMic() throws IOException, LineUnavailableException {

    ResponseObserver<StreamingTranslateSpeechResponse> responseObserver = null;

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (SpeechTranslationServiceClient client = SpeechTranslationServiceClient.create()) {
      responseObserver =
          new ResponseObserver<StreamingTranslateSpeechResponse>() {

            @Override
            public void onStart(StreamController controller) {}

            @Override
            public void onResponse(StreamingTranslateSpeechResponse response) {
              StreamingTranslateSpeechResult res = response.getResult();
              String translation = res.getTextTranslationResult().getTranslation();

              if (res.getTextTranslationResult().getIsFinal()) {
                System.out.println(String.format("\nFinal translation: %s", translation));
              } else {
                System.out.println(String.format("\nPartial translation: %s", translation));
              }
            }

            @Override
            public void onComplete() {}

            public void onError(Throwable t) {
              System.out.println(t);
            }
          };

      ClientStream<StreamingTranslateSpeechRequest> clientStream =
          client.streamingTranslateSpeechCallable().splitCall(responseObserver);

      TranslateSpeechConfig audioConfig =
          TranslateSpeechConfig.newBuilder()
              .setAudioEncoding("linear16")
              .setSourceLanguageCode("en-US")
              .setTargetLanguageCode("es-ES")
              .setSampleRateHertz(16000)
              .build();

      StreamingTranslateSpeechConfig streamingRecognitionConfig =
          StreamingTranslateSpeechConfig.newBuilder().setAudioConfig(audioConfig).build();

      StreamingTranslateSpeechRequest request =
          StreamingTranslateSpeechRequest.newBuilder()
              .setStreamingConfig(streamingRecognitionConfig)
              .build(); // The first request in a streaming call has to be a config

      clientStream.send(request);
      // SampleRate:16000Hz, SampleSizeInBits: 16, Number of channels: 1, Signed: true,
      // bigEndian: false
      AudioFormat audioFormat = new AudioFormat(16000, 16, 1, true, false);
      DataLine.Info targetInfo =
          new DataLine.Info(
              TargetDataLine.class,
              audioFormat); // Set the system information to read from the microphone audio stream

      if (!AudioSystem.isLineSupported(targetInfo)) {
        System.out.println("Microphone not supported");
        System.exit(0);
      }
      // Target data line captures the audio stream the microphone produces.
      TargetDataLine targetDataLine = (TargetDataLine) AudioSystem.getLine(targetInfo);
      targetDataLine.open(audioFormat);
      targetDataLine.start();
      System.out.println("Start speaking... Press Ctrl-C to stop");
      long startTime = System.currentTimeMillis();
      // Audio Input Stream
      AudioInputStream audio = new AudioInputStream(targetDataLine);

      while (true) {
        byte[] data = new byte[6400];
        audio.read(data);
        request =
            StreamingTranslateSpeechRequest.newBuilder()
                .setAudioContent(ByteString.copyFrom(data))
                .build();
        clientStream.send(request);
      }
    }
  }
}

Node.js

Ver en GitHub


// Allow user input from terminal
const readline = require('readline');

const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

function doTranslationLoop() {
  rl.question("Press any key to translate or 'q' to quit: ", answer => {
    if (answer.toLowerCase() === 'q') {
      rl.close();
    } else {
      translateFromMicrophone();
    }
  });
}

// Node-Record-lpcm16
const recorder = require('node-record-lpcm16');

// Imports the Cloud Media Translation client library
const {
  SpeechTranslationServiceClient,
} = require('@google-cloud/media-translation');

// Creates a client
const client = new SpeechTranslationServiceClient();

function translateFromMicrophone() {
  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  //const encoding = 'linear16';
  //const sampleRateHertz = 16000;
  //const sourceLanguage = 'Language to translate from, as BCP-47 locale';
  //const targetLanguage = 'Language to translate to, as BCP-47 locale';
  console.log('Begin speaking ...');

  const config = {
    audioConfig: {
      audioEncoding: encoding,
      sourceLanguageCode: sourceLanguage,
      targetLanguageCode: targetLanguage,
    },
    singleUtterance: true,
  };

  // First request needs to have only a streaming config, no data.
  const initialRequest = {
    streamingConfig: config,
    audioContent: null,
  };

  let currentTranslation = '';
  let currentRecognition = '';
  // Create a recognize stream
  const stream = client
    .streamingTranslateSpeech()
    .on('error', e => {
      if (e.code && e.code === 4) {
        console.log('Streaming translation reached its deadline.');
      } else {
        console.log(e);
      }
    })
    .on('data', response => {
      const {result, speechEventType} = response;
      if (speechEventType === 'END_OF_SINGLE_UTTERANCE') {
        console.log(`\nFinal translation: ${currentTranslation}`);
        console.log(`Final recognition result: ${currentRecognition}`);

        stream.destroy();
        recording.stop();
      } else {
        currentTranslation = result.textTranslationResult.translation;
        currentRecognition = result.recognitionResult;
        console.log(`\nPartial translation: ${currentTranslation}`);
        console.log(`Partial recognition result: ${currentRecognition}`);
      }
    });

  let isFirst = true;
  // Start recording and send microphone input to the Media Translation API
  const recording = recorder.record({
    sampleRateHertz: sampleRateHertz,
    threshold: 0, //silence threshold
    recordProgram: 'rec',
    silence: '5.0', //seconds of silence before ending
  });
  recording
    .stream()
    .on('data', chunk => {
      if (isFirst) {
        stream.write(initialRequest);
        isFirst = false;
      }
      const request = {
        streamingConfig: config,
        audioContent: chunk.toString('base64'),
      };
      if (!stream.destroyed) {
        stream.write(request);
      }
    })
    .on('close', () => {
      doTranslationLoop();
    });
}

doTranslationLoop();

Python

Ver en GitHub


import itertools
import queue

from google.cloud import mediatranslation as media
import pyaudio

# Audio recording parameters
RATE = 16000
CHUNK = int(RATE / 10)  # 100ms
SpeechEventType = media.StreamingTranslateSpeechResponse.SpeechEventType

class MicrophoneStream:
    """Opens a recording stream as a generator yielding the audio chunks."""

    def __init__(self, rate, chunk):
        self._rate = rate
        self._chunk = chunk

        # Create a thread-safe buffer of audio data
        self._buff = queue.Queue()
        self.closed = True

    def __enter__(self):
        self._audio_interface = pyaudio.PyAudio()
        self._audio_stream = self._audio_interface.open(
            format=pyaudio.paInt16,
            channels=1,
            rate=self._rate,
            input=True,
            frames_per_buffer=self._chunk,
            # Run the audio stream asynchronously to fill the buffer object.
            # This is necessary so that the input device's buffer doesn't
            # overflow while the calling thread makes network requests, etc.
            stream_callback=self._fill_buffer,
        )

        self.closed = False

        return self

    def __exit__(self, type=None, value=None, traceback=None):
        self._audio_stream.stop_stream()
        self._audio_stream.close()
        self.closed = True
        # Signal the generator to terminate so that the client's
        # streaming_recognize method will not block the process termination.
        self._buff.put(None)
        self._audio_interface.terminate()

    def _fill_buffer(self, in_data, frame_count, time_info, status_flags):
        """Continuously collect data from the audio stream, into the buffer."""
        self._buff.put(in_data)
        return None, pyaudio.paContinue

    def exit(self):
        self.__exit__()

    def generator(self):
        while not self.closed:
            # Use a blocking get() to ensure there's at least one chunk of
            # data, and stop iteration if the chunk is None, indicating the
            # end of the audio stream.
            chunk = self._buff.get()
            if chunk is None:
                return
            data = [chunk]

            # Now consume whatever other data's still buffered.
            while True:
                try:
                    chunk = self._buff.get(block=False)
                    if chunk is None:
                        return
                    data.append(chunk)
                except queue.Empty:
                    break

            yield b"".join(data)

def listen_print_loop(responses):
    """Iterates through server responses and prints them.

    The responses passed is a generator that will block until a response
    is provided by the server.
    """
    translation = ""
    for response in responses:
        # Once the transcription settles, the response contains the
        # END_OF_SINGLE_UTTERANCE event.
        if response.speech_event_type == SpeechEventType.END_OF_SINGLE_UTTERANCE:
            print(f"\nFinal translation: {translation}")
            return 0

        result = response.result
        translation = result.text_translation_result.translation

        print(f"\nPartial translation: {translation}")

def do_translation_loop():
    print("Begin speaking...")

    client = media.SpeechTranslationServiceClient()

    speech_config = media.TranslateSpeechConfig(
        audio_encoding="linear16",
        source_language_code="en-US",
        target_language_code="es-ES",
    )

    config = media.StreamingTranslateSpeechConfig(
        audio_config=speech_config, single_utterance=True
    )

    # The first request contains the configuration.
    # Note that audio_content is explicitly set to None.
    first_request = media.StreamingTranslateSpeechRequest(streaming_config=config)

    with MicrophoneStream(RATE, CHUNK) as stream:
        audio_generator = stream.generator()
        mic_requests = (
            media.StreamingTranslateSpeechRequest(audio_content=content)
            for content in audio_generator
        )

        requests = itertools.chain(iter([first_request]), mic_requests)

        responses = client.streaming_translate_speech(requests)

        # Print the translation responses as they arrive
        result = listen_print_loop(responses)
        if result == 0:
            stream.exit()

def main():
    while True:
        print()
        option = input("Press any key to translate or 'q' to quit: ")

        if option.lower() == "q":
            break

        do_translation_loop()

if __name__ == "__main__":
    main()