Choose a transcription function
This document compares the transcription functions available in
BigQuery ML: ML.GENERATE_TEXT and ML.TRANSCRIBE.
You can use the information in this document to help you decide which function
to use in cases where the functions have overlapping capabilities.
At a high level, the difference between these functions is as follows:
ML.GENERATE_TEXT is a good choice for transcription of audio clips that are
10 minutes or shorter, and you can also use it to perform natural language
processing (NLP) tasks. Audio transcription with ML.GENERATE_TEXT is less
expensive than with ML.TRANSCRIBE when you use the gemini-1.5-flash model.
ML.TRANSCRIBE is a good choice for performing transcription on audio
clips that are longer than 10 minutes. It also supports a wider range of
languages than ML.GENERATE_TEXT.
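As an illustration of the first option, a transcription call with ML.GENERATE_TEXT over an object table of audio files might look like the following sketch. The model name, dataset, and object table are hypothetical placeholders, not names from this document:

```sql
-- Hypothetical names: replace with your own remote model and object table.
SELECT ml_generate_text_llm_result AS transcript
FROM ML.GENERATE_TEXT(
  MODEL `mydataset.gemini_model`,         -- remote model over a Gemini endpoint
  TABLE `mydataset.audio_object_table`,   -- object table pointing at audio files
  STRUCT('Transcribe this audio clip.' AS prompt,
         TRUE AS flatten_json_output)
);
```

Because the prompt is free-form, the same call shape can also be used for the NLP tasks mentioned above by changing the prompt text.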
Supported models
Supported models are as follows:
ML.GENERATE_TEXT: you can use a subset of the Vertex AI
Gemini models to
generate text. For more information on supported models, see the
ML.GENERATE_TEXT syntax.
ML.TRANSCRIBE: you use the default model of the
Speech-to-Text API. Using the Speech-to-Text API v2
gives you access to transcription with the
Chirp speech model.
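Both functions are backed by remote models that you create with a CREATE MODEL statement. The following is a minimal sketch; the connection name, dataset, model names, and the endpoint value are assumptions for illustration, not values from this document:

```sql
-- Hypothetical connection and model names.
-- A remote model over a Gemini endpoint, for ML.GENERATE_TEXT:
CREATE OR REPLACE MODEL `mydataset.gemini_model`
  REMOTE WITH CONNECTION `us.my_connection`
  OPTIONS (ENDPOINT = 'gemini-1.5-flash-002');

-- A remote model over the Speech-to-Text service, for ML.TRANSCRIBE:
CREATE OR REPLACE MODEL `mydataset.speech_model`
  REMOTE WITH CONNECTION `us.my_connection`
  OPTIONS (REMOTE_SERVICE_TYPE = 'CLOUD_AI_SPEECH_TO_TEXT_V2');
```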
Supported tasks
Supported tasks are as follows:
ML.GENERATE_TEXT: you can perform audio transcription and natural
language processing (NLP) tasks.
ML.TRANSCRIBE: you can perform audio transcription.
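For comparison with the ML.GENERATE_TEXT call shape, a minimal ML.TRANSCRIBE call over an object table of audio files might look like this sketch; the model name, table name, and output column names are assumptions:

```sql
-- Hypothetical names: replace with your own remote model and object table.
SELECT uri, transcripts
FROM ML.TRANSCRIBE(
  MODEL `mydataset.speech_model`,        -- remote Speech-to-Text model
  TABLE `mydataset.audio_object_table`   -- object table pointing at audio files
);
```

Note that there is no prompt argument: ML.TRANSCRIBE performs transcription only.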
Pricing
Pricing is as follows:
ML.GENERATE_TEXT: for pricing of the Vertex AI models that
you use with this function, see
Vertex AI pricing.
Supervised tuning of supported models is charged at dollars per node hour.
For more information, see
Vertex AI custom training pricing.
ML.TRANSCRIBE: for pricing of the Cloud AI service that
you use with this function, see
Speech-to-Text API pricing.
Supervised tuning
Supervised tuning support is as follows:
ML.GENERATE_TEXT: supervised tuning is supported for some models.
ML.TRANSCRIBE: supervised tuning isn't supported.
Queries per minute (QPM) limit
QPM limits are as follows:
ML.GENERATE_TEXT: 60 QPM in the default us-central1 region for
gemini-1.5-pro models, and 200 QPM in the default us-central1 region for
gemini-1.5-flash models. For more information, see
Generative AI on Vertex AI quotas.
ML.TRANSCRIBE: 900 QPM per project. For more information, see
Quotas and limits.
To increase your quota, see Request a quota adjustment.
Token limit
Token limits are as follows:
ML.GENERATE_TEXT: 700 input tokens, and 8196 output tokens. This output
token limit means that ML.GENERATE_TEXT has a limit of approximately
39 minutes for an individual audio clip.
ML.TRANSCRIBE: No token limit. However, this function does have a
limit of 480 minutes for an individual audio clip.
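The approximately-39-minute figure for ML.GENERATE_TEXT follows from the output token limit. As a back-of-envelope check, assume roughly 150 spoken words per minute and about 1.4 output tokens per word; both rates are assumptions for illustration, not documented values:

```sql
-- 8196 output tokens / (150 words/min * 1.4 tokens/word) ≈ 39 minutes
SELECT ROUND(8196 / (150 * 1.4)) AS approx_max_minutes;  -- 39.0
```

Clips with faster speech (more tokens per minute of audio) hit the output limit sooner.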
Supported languages
Supported languages are as follows:
ML.GENERATE_TEXT: supports the same languages as
Gemini.
ML.TRANSCRIBE: supports all of the
Speech-to-Text supported languages.
Region availability
Region availability is as follows:
ML.GENERATE_TEXT: available in all Generative AI for Vertex AI
regions.
ML.TRANSCRIBE: available in the EU and US multi-regions for all
speech recognizers.
Last updated 2025-09-04 UTC.