Use models

Preview: This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms (/terms/service-terms#1). Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions (/products#product-launch-stages).
Use a trained Custom Speech-to-Text model in your production application or benchmarking workflows. As soon as you deploy your model through a dedicated endpoint, you automatically get programmatic access through a recognizer object, which can be used directly through the Speech-to-Text V2 API or in the Google Cloud console.
Before you begin
Ensure that you have signed up for a Google Cloud account, created a project, trained a custom speech model, and deployed it through an endpoint.
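As an optional sanity check before sending requests, you can confirm that your environment resolves Application Default Credentials and an active project. This sketch is not part of the official walkthrough; it assumes you have authenticated, for example with gcloud auth application-default login, or have set GOOGLE_APPLICATION_CREDENTIALS.

import google.auth

# Resolves Application Default Credentials and the active project.
# Raises DefaultCredentialsError if no credentials are configured;
# project_id may be None if no default project is set.
credentials, project_id = google.auth.default()
print(f"Authenticated for project: {project_id}")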
Perform inference in V2
For a Custom Speech-to-Text model to be ready for use, the state of the model in the Models tab must be Active, and the dedicated endpoint in the Endpoints tab must be Deployed.
In our example, where the Google Cloud project ID is custom-models-walkthrough, the endpoint that corresponds to the Custom Speech-to-Text model quantum-computing-lectures-custom-model is quantum-computing-lectures-custom-model-prod-endpoint. The region in which it is available is us-east1, and the batch transcription request is the following:
from google.api_core import client_options
from google.cloud.speech_v2 import SpeechClient
from google.cloud.speech_v2.types import cloud_speech


def quickstart_v2(
    project_id: str,
    audio_file: str,
) -> cloud_speech.RecognizeResponse:
    """Transcribe an audio file."""
    # Instantiates a client against the regional API endpoint
    client = SpeechClient(
        client_options=client_options.ClientOptions(
            api_endpoint="us-east1-speech.googleapis.com"
        )
    )

    # Reads a file as bytes
    with open(audio_file, "rb") as f:
        content = f.read()

    config = cloud_speech.RecognitionConfig(
        auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
        language_codes=["en-US"],
        model="projects/custom-models-walkthrough/locations/us-east1/endpoints/quantum-computing-lectures-custom-model-prod-endpoint",
    )
    request = cloud_speech.RecognizeRequest(
        recognizer="projects/custom-models-walkthrough/locations/us-east1/recognizers/_",
        config=config,
        content=content,
    )

    # Transcribes the audio into text
    response = client.recognize(request=request)

    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

    return response
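To try the sample, you could call the function on a local recording. The file name below is a placeholder; note that in this sample the project, region, and endpoint names are hardcoded in the model and recognizer paths, so the project_id argument is informational only.

# Transcribe a local audio file through the custom model's endpoint.
# "lecture.wav" is a hypothetical file; use any audio in a supported format.
response = quickstart_v2(
    project_id="custom-models-walkthrough",
    audio_file="lecture.wav",
)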
[[["Fácil de entender","easyToUnderstand","thumb-up"],["Meu problema foi resolvido","solvedMyProblem","thumb-up"],["Outro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Informações incorretas ou exemplo de código","incorrectInformationOrSampleCode","thumb-down"],["Não contém as informações/amostras de que eu preciso","missingTheInformationSamplesINeed","thumb-down"],["Problema na tradução","translationIssue","thumb-down"],["Outro","otherDown","thumb-down"]],["Última atualização 2025-08-20 UTC."],[],[],null,["# Use models\n\n| **Preview**\n|\n|\n| This feature is subject to the \"Pre-GA Offerings Terms\" in the General Service Terms section\n| of the [Service Specific Terms](/terms/service-terms#1).\n|\n| Pre-GA features are available \"as is\" and might have limited support.\n|\n| For more information, see the\n| [launch stage descriptions](/products#product-launch-stages).\n\nUse a trained Custom Speech-to-Text model in your production application or benchmarking workflows. As soon as you deploy your model through a dedicated endpoint, you automatically get programmatic access through a recognizer object, which can be used directly through the Speech-to-Text V2 API or in the Google Cloud console.\n\nBefore you begin\n----------------\n\nEnsure you have signed up for a Google Cloud account, created a project, trained a custom speech model, and deployed it using an endpoint.\n\nPerform inference in V2\n-----------------------\n\nFor a Custom Speech-to-Text model to be ready for use, the state of the model in the **Models** tab should be **Active** , and the dedicated endpoint in the **Endpoints** tab must be **Deployed**.\n\nIn our example, where a Google Cloud project ID is `custom-models-walkthrough`, the endpoint that corresponds to the Custom Speech-to-Text model `quantum-computing-lectures-custom-model` is `quantum-computing-lectures-custom-model-prod-endpoint`. 
What's next

Follow the resources to take advantage of custom speech models in your application. See Evaluate your custom models (/speech-to-text/v2/docs/custom-speech-models/evaluate-model).