Last updated (UTC): 2025-09-04.

# Migrating from Speech-to-Text v1 to v2

Speech-to-Text API v2 brings the latest Google Cloud API design, helping
customers meet enterprise security and regulatory requirements out of
the box.

These requirements are realized through the following:

- [**Data Residency**](/speech-to-text/v2/docs/locations): Speech-to-Text v2 offers the broad
  range of our existing transcription models in
  [Google Cloud regions](https://cloud.google.com/about/locations)
  such as Belgium or Singapore. This allows you to invoke our
  transcription models through a fully regionalized service.

- [**Recognizer Resources**](/speech-to-text/v2/docs/recognizers): Recognizers are reusable
  recognition configurations that can contain a combination of model,
  language, and features. Recognizers eliminate the need for dedicated
  service accounts for authentication and authorization.

- **Logging**: Resource creation and transcription generate logs that are
  available in the Google Cloud console, allowing for better telemetry and
  debugging.

- [**Encryption**](/speech-to-text/v2/docs/encryption): Speech-to-Text v2 supports
  [customer-managed encryption keys](/kms/docs/cmek) for all resources as
  well as for batch transcription.

- [**Audio Auto-Detect**](/speech-to-text/v2/docs/encoding): Speech-to-Text v2 can automatically
  detect the sample rate, channel count, and format of your audio files, so
  you don't need to provide that information in the request configuration.

Migrating from v1 to v2
-----------------------

Migration from the v1 API to the v2 API does not happen automatically. Minimal
implementation changes are required to take advantage of the feature set.

### Migrating in API

As in Speech-to-Text v1, to [transcribe audio](/speech-to-text/v2/docs/transcribe-client-libraries)
you create a [`RecognitionConfig`](/speech-to-text/v2/docs/reference/rpc/google.cloud.speech.v2#recognitionconfig) by
selecting the language of your audio and the recognition model of your
choice.

**Note:** The difference between the v1 and v2 definitions of the
`RecognitionConfig` message is the addition of the
[`AutoDetectDecodingConfig`](/speech-to-text/v2/docs/reference/rpc/google.cloud.speech.v2#autodetectdecodingconfig)
message, which automatically detects the audio specifications.
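The field-level changes behind this note can be sketched as a plain dictionary
translation. The helper below is hypothetical (not part of the client library,
and `v1_config_to_v2_fields` is an illustrative name): it shows which v1
`RecognitionConfig` fields carry over to v2 and which are dropped in favor of
auto-detection.

```python
# Hypothetical helper: sketches how v1 RecognitionConfig fields map
# onto their v2 counterparts. Not part of the client library.

def v1_config_to_v2_fields(v1_config: dict) -> dict:
    """Map v1 RecognitionConfig fields to v2 keyword arguments.

    In v2, the singular `language_code` becomes the repeated field
    `language_codes`, and the explicit `encoding` / `sample_rate_hertz`
    fields are replaced by auto-detection (AutoDetectDecodingConfig),
    so they are dropped here.
    """
    v2_fields = {
        "language_codes": [v1_config["language_code"]],
        "model": v1_config.get("model", "long"),
    }
    # `encoding` and `sample_rate_hertz` are intentionally omitted:
    # v2 detects them from the audio when auto_decoding_config is set.
    return v2_fields


print(v1_config_to_v2_fields(
    {"encoding": "LINEAR16", "sample_rate_hertz": 16000,
     "language_code": "en-US", "model": "long"}
))
```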
### Python

```python
import os

from google.cloud.speech_v2 import SpeechClient
from google.cloud.speech_v2.types import cloud_speech

PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


def quickstart_v2(audio_file: str) -> cloud_speech.RecognizeResponse:
    """Transcribe an audio file.

    Args:
        audio_file (str): Path to the local audio file to be transcribed.

    Returns:
        cloud_speech.RecognizeResponse: The response from the recognize
            request, containing the transcription results.
    """
    # Reads a file as bytes
    with open(audio_file, "rb") as f:
        audio_content = f.read()

    # Instantiates a client
    client = SpeechClient()

    config = cloud_speech.RecognitionConfig(
        auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
        language_codes=["en-US"],
        model="long",
    )

    request = cloud_speech.RecognizeRequest(
        recognizer=f"projects/{PROJECT_ID}/locations/global/recognizers/_",
        config=config,
        content=audio_content,
    )

    # Transcribes the audio into text
    response = client.recognize(request=request)

    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

    return response
```

If needed, [select a region](/speech-to-text/v2/docs/locations) in which you want to use the Speech-to-Text API,
and check the [language and model availability](/speech-to-text/v2/docs/speech-to-text-supported-languages) in that region:

### Python

```python
import os

from google.api_core.client_options import ClientOptions
from google.cloud.speech_v2 import SpeechClient
from google.cloud.speech_v2.types import cloud_speech

PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


def change_speech_v2_location(
    audio_file: str, location: str
) -> cloud_speech.RecognizeResponse:
    """Transcribe an audio file in a specific region.

    Specifying a location can reduce latency and meet data residency
    requirements.

    Args:
        audio_file (str): Path to the local audio file to be transcribed.
        location (str): The region where the Speech API will be accessed.
            E.g., "europe-west3"

    Returns:
        cloud_speech.RecognizeResponse: The full response object, which
            includes the transcription results.
    """
    # Reads a file as bytes
    with open(audio_file, "rb") as f:
        audio_content = f.read()

    # Instantiates a client to a regionalized Speech endpoint.
    client = SpeechClient(
        client_options=ClientOptions(
            api_endpoint=f"{location}-speech.googleapis.com",
        )
    )

    config = cloud_speech.RecognitionConfig(
        auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
        language_codes=["en-US"],
        model="long",
    )

    request = cloud_speech.RecognizeRequest(
        recognizer=f"projects/{PROJECT_ID}/locations/{location}/recognizers/_",
        config=config,
        content=audio_content,
    )

    # Transcribes the audio into text
    response = client.recognize(request=request)

    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

    return response
```

Optionally, [create a recognizer resource](/speech-to-text/v2/docs/recognizers) if you need to reuse a
specific recognition configuration across many transcription requests:

### Python

```python
import os

from google.cloud.speech_v2 import SpeechClient
from google.cloud.speech_v2.types import cloud_speech

PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


def create_recognizer(recognizer_id: str) -> cloud_speech.Recognizer:
    """Creates a recognizer with a unique ID and default recognition configuration.

    Args:
        recognizer_id (str): The unique identifier for the recognizer to be created.

    Returns:
        cloud_speech.Recognizer: The created recognizer object with configuration.
    """
    # Instantiates a client
    client = SpeechClient()

    request = cloud_speech.CreateRecognizerRequest(
        parent=f"projects/{PROJECT_ID}/locations/global",
        recognizer_id=recognizer_id,
        recognizer=cloud_speech.Recognizer(
            default_recognition_config=cloud_speech.RecognitionConfig(
                language_codes=["en-US"], model="long"
            ),
        ),
    )

    # Sends the request to create a recognizer and waits for the operation to complete
    operation = client.create_recognizer(request=request)
    recognizer = operation.result()

    print("Created Recognizer:", recognizer.name)
    return recognizer
```

There are other differences in the requests and responses of the new v2 API.
For more details, see the [reference documentation](/speech-to-text/v2/docs/apis).

### Migrating in UI

To migrate using the Google Cloud console, follow these steps:

1. Go to the [Speech Google Cloud console](https://console.cloud.google.com/speech).

2. Navigate to the **Transcriptions** page.

3. Click **New Transcription** and select your audio in the **Audio configuration** tab.

4. In the **Transcription options** tab, select **V2**.
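The samples above all assemble v2 recognizer resource names and regional
endpoints by hand. The helpers below are hypothetical conveniences (not part
of the client library; `recognizer_path` and `regional_endpoint` are
illustrative names) that make those naming patterns explicit, assuming the
global location uses the default `speech.googleapis.com` endpoint.

```python
# Hypothetical helpers making the v2 naming patterns explicit.
# Not part of the client library.

def recognizer_path(project_id: str, location: str, recognizer_id: str = "_") -> str:
    """Build the full resource name of a recognizer.

    The ID "_" refers to the default recognizer in that location, as used
    in the transcription samples above.
    """
    return f"projects/{project_id}/locations/{location}/recognizers/{recognizer_id}"


def regional_endpoint(location: str) -> str:
    """Return the API endpoint for a location.

    The global location uses the default endpoint; regional locations are
    addressed through a location-prefixed hostname.
    """
    if location == "global":
        return "speech.googleapis.com"
    return f"{location}-speech.googleapis.com"


print(recognizer_path("my-project", "europe-west3", "my-recognizer"))
print(regional_endpoint("europe-west3"))
```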