Cloud Speech-to-Text Client Libraries

This page shows how to get started with the Cloud Client Libraries for the Speech-to-Text API. Read more about the client libraries for Cloud APIs, including the older Google APIs Client Libraries, in Client Libraries Explained.

Installing the client library

C#

For more information, see Setting Up a C# Development Environment.
Install-Package Google.Cloud.Speech.V1 -Pre

Go

go get -u cloud.google.com/go/speech/apiv1

Java

For more information, see Setting Up a Java Development Environment. Si usas Maven, agrega lo siguiente al archivo pom.xml:
<dependencyManagement>
 <dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>libraries-bom</artifactId>
    <version>2.8.0</version>
    <type>pom</type>
    <scope>import</scope>
   </dependency>
 </dependencies>
</dependencyManagement>

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-speech</artifactId>
</dependency>
Si usas Gradle, agrega lo siguiente a tus dependencias:
compile 'com.google.cloud:google-cloud-speech:1.21.0'
Si usas SBT, agrega lo siguiente a tus dependencias:
libraryDependencies += "com.google.cloud" % "google-cloud-speech" % "1.21.0"

Si usas IntelliJ o Eclipse, puedes agregar bibliotecas cliente a tu proyecto mediante los siguientes complementos IDE:

Los complementos brindan funcionalidades adicionales, como administración de claves para las cuentas de servicio. Consulta la documentación de cada complemento para obtener más detalles.

Node.js

For more information, see Setting Up a Node.js Development Environment.
npm install --save @google-cloud/speech

PHP

composer require google/cloud-speech

Python

For more information, see Setting Up a Python Development Environment.
pip install --upgrade google-cloud-speech

Ruby

For more information, see Setting Up a Ruby Development Environment.
gem install google-cloud-speech

Setting up authentication

To run the client library, you must first set up authentication by creating a service account and setting an environment variable. Complete the following steps to set up authentication. For other ways to authenticate, see the GCP authentication documentation.

GCP Console

  1. En GCP Console, ve a la página Crear clave de la cuenta de servicio.

    Ir a la página Crear clave de la cuenta de servicio
  2. En la lista Cuenta de servicio, selecciona Cuenta de servicio nueva.
  3. Ingresa un nombre en el campo Nombre de cuenta de servicio.
  4. En la lista Función, selecciona Proyecto > Propietario.

    Nota: El campo Función autoriza tu cuenta de servicio para acceder a los recursos. Puedes ver y cambiar este campo luego con GCP Console. Si desarrollas una app de producción, especifica permisos más detallados que Proyecto > Propietario. Para obtener más información, consulta Cómo otorgar funciones a las cuentas de servicio.
  5. Haz clic en Crear. Se descargará un archivo JSON a tu computadora que contiene tus descargas de claves.

Línea de comandos

Puedes ejecutar los siguientes comandos con el SDK de Cloud en tu máquina local o en Cloud Shell.

  1. Crea la cuenta de servicio. Reemplaza [NOMBRE] por un nombre para la cuenta de servicio.

    gcloud iam service-accounts create [NAME]
  2. Otorga permisos a la cuenta de servicio. Reemplaza [PROJECT_ID] por el ID del proyecto.

    gcloud projects add-iam-policy-binding [PROJECT_ID] --member "serviceAccount:[NAME]@[PROJECT_ID].iam.gserviceaccount.com" --role "roles/owner"
    Nota: El campo Función autoriza a tu cuenta de servicio para acceder a los recursos. Puedes ver y cambiar este campo luego con GCP Console. Si desarrollas una app de producción, especifica permisos más detallados que Proyecto > Propietario. Para obtener más información, consulta Cómo otorgar funciones a las cuentas de servicio.
  3. Genera el archivo de claves. Reemplaza [FILE_NAME] por un nombre para el archivo de claves.

    gcloud iam service-accounts keys create [FILE_NAME].json --iam-account [NAME]@[PROJECT_ID].iam.gserviceaccount.com

Proporciona credenciales de autenticación a tu código de la aplicación configurando la variable de entorno GOOGLE_APPLICATION_CREDENTIALS. Reemplaza [PATH] con la ruta de acceso al archivo JSON que contiene la clave de tu cuenta de servicio y [FILE_NAME] con el nombre del archivo. Esta variable solo se aplica a la sesión actual de shell. Por lo tanto, si abres una sesión nueva, deberás volver a configurar la variable.

Linux o macOS

export GOOGLE_APPLICATION_CREDENTIALS="[PATH]"

Por ejemplo:

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/[FILE_NAME].json"

Windows

Con PowerShell:

$env:GOOGLE_APPLICATION_CREDENTIALS="[PATH]"

Por ejemplo:

$env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\[FILE_NAME].json"

Con el símbolo del sistema:

set GOOGLE_APPLICATION_CREDENTIALS=[PATH]

Using the client library

The following example shows how to use the client library.

C#


using Google.Cloud.Speech.V1;
using System;

namespace GoogleCloudSamples
{
    public class QuickStart
    {
        // The name of the local audio file to transcribe
        public static string DEMO_FILE = "audio.raw";
        public static void Main(string[] args)
        {
            var speech = SpeechClient.Create();
            var response = speech.Recognize(new RecognitionConfig()
            {
                Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
                SampleRateHertz = 16000,
                LanguageCode = "en",
            }, RecognitionAudio.FromFile(DEMO_FILE));
            foreach (var result in response.Results)
            {
                foreach (var alternative in result.Alternatives)
                {
                    Console.WriteLine(alternative.Transcript);
                }
            }
        }
    }
}

Go


// Sample speech-quickstart uses the Google Cloud Speech API to transcribe
// audio.
package main

import (
	"context"
	"fmt"
	"io/ioutil"
	"log"

	speech "cloud.google.com/go/speech/apiv1"
	speechpb "google.golang.org/genproto/googleapis/cloud/speech/v1"
)

func main() {
	ctx := context.Background()

	// Creates a client.
	client, err := speech.NewClient(ctx)
	if err != nil {
		log.Fatalf("Failed to create client: %v", err)
	}

	// Sets the name of the audio file to transcribe.
	filename := "/path/to/audio.raw"

	// Reads the audio file into memory.
	data, err := ioutil.ReadFile(filename)
	if err != nil {
		log.Fatalf("Failed to read file: %v", err)
	}

	// Detects speech in the audio file.
	resp, err := client.Recognize(ctx, &speechpb.RecognizeRequest{
		Config: &speechpb.RecognitionConfig{
			Encoding:        speechpb.RecognitionConfig_LINEAR16,
			SampleRateHertz: 16000,
			LanguageCode:    "en-US",
		},
		Audio: &speechpb.RecognitionAudio{
			AudioSource: &speechpb.RecognitionAudio_Content{Content: data},
		},
	})
	if err != nil {
		log.Fatalf("failed to recognize: %v", err)
	}

	// Prints the results.
	for _, result := range resp.Results {
		for _, alt := range result.Alternatives {
			fmt.Printf("\"%v\" (confidence=%3f)\n", alt.Transcript, alt.Confidence)
		}
	}
}

Java

// Imports the Google Cloud client library
import com.google.cloud.speech.v1.RecognitionAudio;
import com.google.cloud.speech.v1.RecognitionConfig;
import com.google.cloud.speech.v1.RecognitionConfig.AudioEncoding;
import com.google.cloud.speech.v1.RecognizeResponse;
import com.google.cloud.speech.v1.SpeechClient;
import com.google.cloud.speech.v1.SpeechRecognitionAlternative;
import com.google.cloud.speech.v1.SpeechRecognitionResult;
import com.google.protobuf.ByteString;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class QuickstartSample {

  /**
   * Demonstrates using the Speech API to transcribe an audio file.
   */
  public static void main(String... args) throws Exception {
    // Instantiates a client
    try (SpeechClient speechClient = SpeechClient.create()) {

      // The path to the audio file to transcribe
      String fileName = "./resources/audio.raw";

      // Reads the audio file into memory
      Path path = Paths.get(fileName);
      byte[] data = Files.readAllBytes(path);
      ByteString audioBytes = ByteString.copyFrom(data);

      // Builds the sync recognize request
      RecognitionConfig config = RecognitionConfig.newBuilder()
          .setEncoding(AudioEncoding.LINEAR16)
          .setSampleRateHertz(16000)
          .setLanguageCode("en-US")
          .build();
      RecognitionAudio audio = RecognitionAudio.newBuilder()
          .setContent(audioBytes)
          .build();

      // Performs speech recognition on the audio file
      RecognizeResponse response = speechClient.recognize(config, audio);
      List<SpeechRecognitionResult> results = response.getResultsList();

      for (SpeechRecognitionResult result : results) {
        // There can be several alternative transcripts for a given chunk of speech. Just use the
        // first (most likely) one here.
        SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
        System.out.printf("Transcription: %s%n", alternative.getTranscript());
      }
    }
  }
}

Node.js

async function main() {
  // Imports the Google Cloud client library
  const speech = require('@google-cloud/speech');
  const fs = require('fs');

  // Creates a client
  const client = new speech.SpeechClient();

  // The name of the audio file to transcribe
  const fileName = './resources/audio.raw';

  // Reads a local audio file and converts it to base64
  const file = fs.readFileSync(fileName);
  const audioBytes = file.toString('base64');

  // The audio file's encoding, sample rate in hertz, and BCP-47 language code
  const audio = {
    content: audioBytes,
  };
  const config = {
    encoding: 'LINEAR16',
    sampleRateHertz: 16000,
    languageCode: 'en-US',
  };
  const request = {
    audio: audio,
    config: config,
  };

  // Detects speech in the audio file
  const [response] = await client.recognize(request);
  const transcription = response.results
    .map(result => result.alternatives[0].transcript)
    .join('\n');
  console.log(`Transcription: ${transcription}`);
}
main().catch(console.error);

PHP

# Includes the autoloader for libraries installed with composer
require __DIR__ . '/vendor/autoload.php';

# Imports the Google Cloud client library
use Google\Cloud\Speech\V1\SpeechClient;
use Google\Cloud\Speech\V1\RecognitionAudio;
use Google\Cloud\Speech\V1\RecognitionConfig;
use Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding;

# The name of the audio file to transcribe
$audioFile = __DIR__ . '/test/data/audio32KHz.raw';

# get contents of a file into a string
$content = file_get_contents($audioFile);

# set string as audio content
$audio = (new RecognitionAudio())
    ->setContent($content);

# The audio file's encoding, sample rate and language
$config = new RecognitionConfig([
    'encoding' => AudioEncoding::LINEAR16,
    'sample_rate_hertz' => 32000,
    'language_code' => 'en-US'
]);

# Instantiates a client
$client = new SpeechClient();

# Detects speech in the audio file
$response = $client->recognize($config, $audio);

# Print most likely transcription
foreach ($response->getResults() as $result) {
    $alternatives = $result->getAlternatives();
    $mostLikely = $alternatives[0];
    $transcript = $mostLikely->getTranscript();
    printf('Transcript: %s' . PHP_EOL, $transcript);
}

$client->close();

Python

import io
import os

# Imports the Google Cloud client library
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types

# Instantiates a client
client = speech.SpeechClient()

# The name of the audio file to transcribe
file_name = os.path.join(
    os.path.dirname(__file__),
    'resources',
    'audio.raw')

# Loads the audio into memory
with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()
    audio = types.RecognitionAudio(content=content)

config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

# Detects speech in the audio file
response = client.recognize(config, audio)

for result in response.results:
    print('Transcript: {}'.format(result.alternatives[0].transcript))

Ruby

# Imports the Google Cloud client library
require "google/cloud/speech"

# Instantiates a client
speech = Google::Cloud::Speech.new

# The name of the audio file to transcribe
file_name = "./resources/audio.raw"

# The raw audio
audio_file = File.binread file_name

# The audio file's encoding and sample rate
config = { encoding:          :LINEAR16,
           sample_rate_hertz: 16_000,
           language_code:     "en-US" }
audio  = { content: audio_file }

# Detects speech in the audio file
response = speech.recognize config, audio

results = response.results

# Get first result because we only processed a single audio file
# Each result represents a consecutive portion of the audio
results.first.alternatives.each do |alternatives|
  puts "Transcription: #{alternatives.transcript}"
end

Additional resources

¿Te ha resultado útil esta página? Enviar comentarios:

Enviar comentarios sobre...

Cloud Speech-to-Text Documentation
Si necesitas ayuda, visita nuestra página de asistencia.