Use a trained Custom Speech-to-Text model in your production application or benchmarking workflows. As soon as you deploy your model through a dedicated endpoint, you automatically get programmatic access through a recognizer object, which you can use directly through the Speech-to-Text V2 API or in the Google Cloud console.
Before you begin
Make sure you have signed up for a Google Cloud account, created a project, trained a custom speech model, and deployed it using an endpoint.
Perform inference in V2
For a Custom Speech-to-Text model to be ready for use, the state of the model in the Models tab must be Active, and the dedicated endpoint in the Endpoints tab must be Deployed.
In this example, the Google Cloud project ID is custom-models-walkthrough, and the endpoint that corresponds to the Custom Speech-to-Text model quantum-computing-lectures-custom-model is quantum-computing-lectures-custom-model-prod-endpoint. The endpoint is available in the us-east1 region, and the transcription request is the following:
    from google.api_core import client_options
    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech

    def quickstart_v2(
        project_id: str,
        audio_file: str,
    ) -> cloud_speech.RecognizeResponse:
        """Transcribe an audio file."""
        # Instantiates a client
        client = SpeechClient(
            client_options=client_options.ClientOptions(
                api_endpoint="us-east1-speech.googleapis.com"
            )
        )

        # Reads a file as bytes
        with open(audio_file, "rb") as f:
            content = f.read()

        config = cloud_speech.RecognitionConfig(
            auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
            language_codes=["en-US"],
            model="projects/custom-models-walkthrough/locations/us-east1/endpoints/quantum-computing-lectures-custom-model-prod-endpoint",
        )
        request = cloud_speech.RecognizeRequest(
            recognizer="projects/custom-models-walkthrough/locations/us-east1/recognizers/_",
            config=config,
            content=content,
        )

        # Transcribes the audio into text
        response = client.recognize(request=request)

        for result in response.results:
            print(f"Transcript: {result.alternatives[0].transcript}")

        return response
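The sample above repeats the region in three places: the API host, the model path, and the recognizer path. One way to keep them from drifting apart is to derive all three from a single value. The following is a minimal sketch under that assumption; `CustomModelTarget` and its property names are hypothetical and are not part of the Speech-to-Text client library. The string patterns are copied from the sample, not from a separate specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomModelTarget:
    """Hypothetical helper: builds the three region-dependent strings
    used in the sample from one project, region, and endpoint ID."""

    project_id: str
    region: str
    endpoint_id: str

    @property
    def api_endpoint(self) -> str:
        # Regional API host, following the pattern in the sample.
        return f"{self.region}-speech.googleapis.com"

    @property
    def model(self) -> str:
        # Resource path of the dedicated endpoint serving the custom model.
        return (
            f"projects/{self.project_id}/locations/{self.region}"
            f"/endpoints/{self.endpoint_id}"
        )

    @property
    def recognizer(self) -> str:
        # The sample uses "_" as the recognizer ID.
        return f"projects/{self.project_id}/locations/{self.region}/recognizers/_"


target = CustomModelTarget(
    project_id="custom-models-walkthrough",
    region="us-east1",
    endpoint_id="quantum-computing-lectures-custom-model-prod-endpoint",
)
print(target.api_endpoint)
print(target.model)
print(target.recognizer)
```

The resulting strings can be passed as `api_endpoint`, `model`, and `recognizer` in the sample in place of the hard-coded literals, so changing the region requires editing only one value.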
Note: If you try to create a recognizer object in a different region than the one that the endpoint is created in, the request will fail.

Preview: This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms (/terms/service-terms#1). Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions (/products#product-launch-stages).

What's next
Follow the resources to take advantage of custom speech models in your application. See Evaluate your custom models (/speech-to-text/v2/docs/custom-speech-models/evaluate-model).

Last updated: 2025-09-04 (UTC)