gcloud alpha ml speech recognizers create

NAME: gcloud alpha ml speech recognizers create - create a speech-to-text recognizer
SYNOPSIS: gcloud alpha ml speech recognizers create (RECOGNIZER : --location=LOCATION) --language-codes=[LANGUAGE_CODE,…] --model=MODEL [--async] [--display-name=DISPLAY_NAME] [--audio-channel-count=AUDIO_CHANNEL_COUNT --encoding=ENCODING --sample-rate=SAMPLE_RATE] [--enable-automatic-punctuation --enable-spoken-emojis --enable-spoken-punctuation --enable-word-confidence --enable-word-time-offsets --max-alternatives=MAX_ALTERNATIVES --profanity-filter --separate-channel-recognition --max-speaker-count=MAX_SPEAKER_COUNT --min-speaker-count=MIN_SPEAKER_COUNT] [GCLOUD_WIDE_FLAG …]
DESCRIPTION: (ALPHA) Create a speech-to-text recognizer.
POSITIONAL ARGUMENTS: Recognizer resource - recognizer. The arguments in this group can be used to specify the attributes of this resource. (NOTE) Some attributes are not given arguments in this group but can be set in other ways.
To set the project attribute:

provide the argument recognizer on the command line with a fully specified name;

provide the argument --project on the command line;

set the property core/project.

This must be specified.

RECOGNIZER

ID of the recognizer or fully qualified identifier for the recognizer.
To set the recognizer attribute:

provide the argument recognizer on the command line.

This positional argument must be specified if any of the other arguments in this group are specified.

--location=LOCATION

Location of the recognizer. To set the location attribute:

provide the argument recognizer on the command line with a fully specified name;

provide the argument --location on the command line.
REQUIRED FLAGS: --language-codes=[LANGUAGE_CODE,…]

Language code is one of en-US, en-GB, fr-FR. Check documentation for using more than one language code.

--model=MODEL

Which model to use for recognition requests. Select the model best suited to your domain to get best results. Guidance for choosing which model to use can be found in the Transcription Models Documentation and the models supported in each region can be found in the Table Of Supported Models.
OPTIONAL FLAGS: --async

Return immediately, without waiting for the operation in progress to complete. The default is False.

--display-name=DISPLAY_NAME

Name of this recognizer as it appears in UIs.

Encoding format

--audio-channel-count=AUDIO_CHANNEL_COUNT

Number of channels present in the audio data sent for recognition. Required if --encoding flag is specified and is not AUTO. Must be set to a value between 1 and 8.

--encoding=ENCODING

Encoding format of the provided audio. For headerless formats, must be set to LINEAR16, MULAW, or ALAW. For other formats, set to AUTO. Overrides the recognizer configuration if present, else uses recognizer encoding.

--sample-rate=SAMPLE_RATE

Sample rate in Hertz of the audio data sent for recognition. Required if --encoding flag is specified and is not AUTO. Must be set to a value between 8000 and 48000.

ASR Features

--enable-automatic-punctuation

If set, adds punctuation to recognition result hypotheses.

--enable-spoken-emojis

If set, adds spoken emoji formatting.

--enable-spoken-punctuation

If set, replaces spoken punctuation with the corresponding symbols in the request.

--enable-word-confidence

If set, the top result includes a list of words and the confidence for those words.

--enable-word-time-offsets

If set, the top result includes a list of words and their timestamps.

--max-alternatives=MAX_ALTERNATIVES

Maximum number of recognition hypotheses to be returned. Must be set to a value between 1 and 30.

--profanity-filter

If set, the server will censor profanities.

--separate-channel-recognition

Mode for recognizing multi-channel audio using Separate Channel Recognition. When set, the service will recognize each channel independently.

Speaker Diarization

--max-speaker-count=MAX_SPEAKER_COUNT

Maximum number of speakers in the conversation. Must be greater than or equal to --min-speaker-count. Must be set to a value between 1 and 6.
This flag argument must be specified if any of the other arguments in this group are specified.

--min-speaker-count=MIN_SPEAKER_COUNT

Minimum number of speakers in the conversation. Must be less than or equal to --max-speaker-count. Must be set to a value between 1 and 6.
This flag argument must be specified if any of the other arguments in this group are specified.
GCLOUD WIDE FLAGS: These flags are available to all commands: --access-token-file, --account, --billing-project, --configuration, --flags-file, --flatten, --format, --help, --impersonate-service-account, --log-http, --project, --quiet, --trace-token, --user-output-enabled, --verbosity.
Run $ gcloud help for details.
NOTES: This command is currently in alpha and might change without notice. If this command fails with API permission errors despite specifying the correct project, you might be trying to access an API with an invitation-only early access allowlist.