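Choose a transcription function
This document compares the transcription functions available in BigQuery ML: ML.GENERATE_TEXT and ML.TRANSCRIBE.
Use the information in this document to decide which function to use when their capabilities overlap.
At a high level, the functions differ as follows: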
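ML.GENERATE_TEXT is a good choice for transcribing audio clips that are 10 minutes or shorter, and you can also use it to perform natural language processing (NLP) tasks. When you use the gemini-1.5-flash model, audio transcription with ML.GENERATE_TEXT costs less than with ML.TRANSCRIBE.
ML.TRANSCRIBE is a good choice for transcribing audio clips that are longer than 10 minutes. It also supports a wider range of languages than ML.GENERATE_TEXT.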
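The guidance above can be sketched as a small decision helper. This is only an illustration of the rule of thumb: the `Clip` class, its fields, and `choose_function` are hypothetical names, and the sketch assumes the caller already knows the clip duration and whether the spoken language is covered by Gemini.

```python
from dataclasses import dataclass

# Rule-of-thumb threshold from the guidance above (not a hard API limit).
SHORT_CLIP_MINUTES = 10.0


@dataclass(frozen=True)
class Clip:
    """Hypothetical description of one audio clip."""
    duration_minutes: float
    # Assumption: the caller knows whether Gemini supports the spoken language.
    language_in_gemini: bool = True


def choose_function(clip: Clip) -> str:
    """Return the function name that the guidance above points to."""
    if clip.duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    # ML.TRANSCRIBE supports a wider range of languages.
    if not clip.language_in_gemini:
        return "ML.TRANSCRIBE"
    # Clips longer than 10 minutes are a better fit for ML.TRANSCRIBE.
    if clip.duration_minutes > SHORT_CLIP_MINUTES:
        return "ML.TRANSCRIBE"
    return "ML.GENERATE_TEXT"


if __name__ == "__main__":
    for clip in (Clip(5), Clip(45), Clip(5, language_in_gemini=False)):
        print(clip, "->", choose_function(clip))
```

Cost, tuning, and quota considerations are deliberately left out; they depend on the model you use and on your project.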
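Supported models
Supported models are as follows:
ML.GENERATE_TEXT: you can use a subset of the Vertex AI Gemini models to generate text. For more information on supported models, see the ML.GENERATE_TEXT syntax.
ML.TRANSCRIBE: you use the default model of the Speech-to-Text API. Using the Speech-to-Text API gives you access to transcription with the Chirp speech model.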
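Queries per minute (QPM) limit
QPM limits are as follows:
ML.GENERATE_TEXT: 60 QPM in the default us-central1 region for gemini-1.5-pro models, and 200 QPM in the default us-central1 region for gemini-1.5-flash models. For more information, see Generative AI on Vertex AI quotas.
ML.TRANSCRIBE: 900 QPM per project. For more information, see Quotas and limits.
To increase your quota, see Request a quota adjustment.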
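To get a feel for what the QPM limits above mean for a batch job, the following sketch computes a lower bound on wall-clock time. It assumes one request per audio clip and that the quota is the only bottleneck; `QPM_LIMITS` and `min_minutes_for` are hypothetical names used only for illustration.

```python
import math

# QPM figures quoted above; the dictionary keys are illustrative labels.
QPM_LIMITS = {
    "ML.GENERATE_TEXT (gemini-1.5-pro)": 60,
    "ML.GENERATE_TEXT (gemini-1.5-flash)": 200,
    "ML.TRANSCRIBE": 900,
}


def min_minutes_for(clips: int, qpm: int) -> int:
    """Lower bound on minutes needed to send `clips` requests at `qpm`."""
    if clips < 0 or qpm <= 0:
        raise ValueError("clips must be >= 0 and qpm must be > 0")
    # Assumption: one request per clip, with no retries or other throttling.
    return math.ceil(clips / qpm)


if __name__ == "__main__":
    for label, qpm in QPM_LIMITS.items():
        print(f"{label}: at least {min_minutes_for(18_000, qpm)} minutes")
```

Actual run time also depends on clip length and on how quickly the service responds, so treat the result as a floor rather than an estimate.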
Supported tasks
Supported tasks are as follows:
ML.GENERATE_TEXT: you can perform audio transcription and natural language processing (NLP) tasks.
ML.TRANSCRIBE: you can perform audio transcription.
Pricing
Pricing is as follows:
ML.GENERATE_TEXT: for pricing of the Vertex AI models that you use with this function, see Vertex AI pricing. Supervised tuning of supported models is charged at dollars per node hour. For more information, see Vertex AI custom training pricing.
ML.TRANSCRIBE: for pricing of the Cloud AI service that you use with this function, see Speech-to-Text API pricing.
Supervised tuning
Supervised tuning support is as follows:
ML.GENERATE_TEXT: supervised tuning is supported for some models.
ML.TRANSCRIBE: supervised tuning isn't supported.
Token limit
Token limits are as follows:
ML.GENERATE_TEXT: 1 million input tokens and 8,192 output tokens. This output token limit means that ML.GENERATE_TEXT has a limit of approximately 39 minutes for an individual audio clip.
ML.TRANSCRIBE: no token limit. However, this function does have a limit of 480 minutes for an individual audio clip.
Supported languages
Supported languages are as follows:
ML.GENERATE_TEXT: supports the same languages as Gemini.
ML.TRANSCRIBE: supports all of the Speech-to-Text supported languages.
Region availability
Region availability is as follows:
ML.GENERATE_TEXT: available in all Generative AI for Vertex AI regions.
ML.TRANSCRIBE: available in the EU and US multi-regions for all speech recognizers.
Last updated: 2025-09-04 (UTC)
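The link between the ML.GENERATE_TEXT output token limit and the roughly 39-minute ceiling described in the token limit section can be checked with simple arithmetic. This sketch assumes an output limit of 8,192 tokens and that transcript length grows linearly with clip duration, which real speech does not guarantee. Both function names are hypothetical.

```python
def implied_tokens_per_minute(output_token_limit: int,
                              max_clip_minutes: float) -> float:
    """Tokens of transcript per minute of audio implied by the two limits."""
    if output_token_limit <= 0 or max_clip_minutes <= 0:
        raise ValueError("both arguments must be positive")
    return output_token_limit / max_clip_minutes


def estimated_output_tokens(clip_minutes: float,
                            tokens_per_minute: float) -> float:
    """Rough transcript size for a clip, assuming a constant speaking rate."""
    return clip_minutes * tokens_per_minute


if __name__ == "__main__":
    # Assumed figures: 8,192 output tokens and a ~39-minute clip ceiling.
    rate = implied_tokens_per_minute(8192, 39)
    print(f"implied rate: {rate:.0f} tokens per minute of audio")
    print(f"10-minute clip: about {estimated_output_tokens(10, rate):.0f} tokens")
```

Under these assumptions a 10-minute clip uses only about a quarter of the output budget, which is consistent with the recommendation to use ML.GENERATE_TEXT for clips of 10 minutes or less.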