This document provides an overview of the services you can use to build an ML pipeline to manage your BigQuery ML [MLOps](/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning) workflow.
An ML pipeline is a representation of an MLOps workflow that is composed of a series of *pipeline tasks*. Each pipeline task performs a specific step in the MLOps workflow to train and deploy a model. Separating each step into a standardized, reusable task lets you automate and monitor repeatable processes in your ML practice.
You can use any of the following services to create ML pipelines for BigQuery ML:
- Use Vertex AI Pipelines to create portable, extensible ML pipelines.
- Use GoogleSQL queries to create less complex SQL-based ML pipelines.
- Use Dataform to create more complex SQL-based ML pipelines, or ML pipelines where you need to use version control.
Vertex AI Pipelines
-------------------
In [Vertex AI Pipelines](/vertex-ai/docs/pipelines/introduction), an ML pipeline is structured as a directed acyclic graph (DAG) of containerized pipeline tasks that are interconnected using input-output dependencies.
Each [pipeline task](/vertex-ai/docs/pipelines/introduction#pipeline-task) is an instantiation of a [pipeline component](/vertex-ai/docs/pipelines/introduction#pipeline-component) with specific inputs. When defining your ML pipeline, you connect multiple pipeline tasks to form a DAG by routing the outputs of one pipeline task to the inputs of the next pipeline task in the ML workflow. You can also use the original inputs to the ML pipeline as the inputs for a given pipeline task.
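As an illustration of this structure, the following is a minimal sketch of a two-task DAG that trains a BigQuery ML model and routes the resulting model artifact into an evaluation task. It uses the BigQuery ML pipeline components introduced below and assumes the `kfp` and `google-cloud-pipeline-components` packages are installed; the project, dataset, and model names are placeholders.

```python
# A minimal sketch of a two-task BigQuery ML pipeline DAG.
from kfp import compiler, dsl
from google_cloud_pipeline_components.v1.bigquery import (
    BigqueryCreateModelJobOp,
    BigqueryEvaluateModelJobOp,
)

@dsl.pipeline(name="bqml-train-and-evaluate")
def bqml_pipeline(project: str = "my-project", location: str = "US"):
    # Task 1: train a model with a CREATE MODEL statement.
    create_model = BigqueryCreateModelJobOp(
        project=project,
        location=location,
        query="""
            CREATE OR REPLACE MODEL `my_dataset.my_model`
            OPTIONS (model_type = 'linear_reg', input_label_cols = ['label']) AS
            SELECT * FROM `my_dataset.training_data`
        """,
    )

    # Task 2: evaluate the trained model. Routing the model artifact output
    # of the first task to the input of this task creates the DAG edge.
    BigqueryEvaluateModelJobOp(
        project=project,
        location=location,
        model=create_model.outputs["model"],
    )

# Compile to a pipeline definition that you can submit to Vertex AI Pipelines.
compiler.Compiler().compile(
    pipeline_func=bqml_pipeline, package_path="bqml_pipeline.json"
)
```

Each component call becomes a containerized pipeline task, so the compiled DAG matches the structure described above.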
Use the [BigQuery ML components](/vertex-ai/docs/pipelines/bigqueryml-component) of the Google Cloud Pipeline Components SDK to compose ML pipelines in Vertex AI Pipelines. To get started with BigQuery ML components, see the following notebooks:

- [Get started with BigQuery ML pipeline components](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/ml_ops/stage3/get_started_with_bqml_pipeline_components.ipynb)
- [Train and evaluate a demand forecasting model](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/pipelines/google_cloud_pipeline_components_bqml_pipeline_demand_forecasting.ipynb)

GoogleSQL queries
-----------------

You can use [GoogleSQL procedural language](/bigquery/docs/reference/standard-sql/procedural-language) to execute multiple statements in a [multi-statement query](/bigquery/docs/multi-statement-queries). You can use a multi-statement query to:

- Run multiple statements in a sequence, with shared state.
- Automate management tasks such as creating or dropping tables.
- Implement complex logic using programming constructs such as `IF` and `WHILE`.

After creating a multi-statement query, you can [save](/bigquery/docs/saved-queries-introduction) and [schedule](/bigquery/docs/scheduling-queries) the query to automate model training, inference, and monitoring.
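For example, the following is a minimal sketch of a multi-statement query that retrains a model and keeps it only if it evaluates well enough, submitted from Python. It assumes the `google-cloud-bigquery` client library; the dataset, model, and quality threshold are placeholders.

```python
# A minimal sketch of running a multi-statement query as one BigQuery job.
from google.cloud import bigquery

client = bigquery.Client()

sql = """
-- Retrain the model, then keep it only if it evaluates well enough.
DECLARE r2 FLOAT64;

CREATE OR REPLACE MODEL `my_dataset.candidate_model`
OPTIONS (model_type = 'linear_reg', input_label_cols = ['label']) AS
SELECT * FROM `my_dataset.training_data`;

SET r2 = (
  SELECT r2_score
  FROM ML.EVALUATE(MODEL `my_dataset.candidate_model`)
);

IF r2 < 0.8 THEN
  DROP MODEL `my_dataset.candidate_model`;
END IF;
"""

# The statements run in sequence with shared state (the r2 variable),
# combining model training, evaluation, and management in one query.
client.query(sql).result()
```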
If your ML pipeline includes use of the [`ML.GENERATE_TEXT` function](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-text), see [Handle quota errors by calling `ML.GENERATE_TEXT` iteratively](/bigquery/docs/iterate-generate-text-calls) for more information on how to use SQL to iterate through calls to the function. Calling the function iteratively lets you address any retryable errors that occur due to exceeding the [quotas and limits](/bigquery/quotas#cloud_ai_service_functions).

Dataform
--------

You can use [Dataform](/dataform/docs/overview) to develop, test, version control, and schedule complex SQL workflows for data transformation in BigQuery. You can use Dataform for such tasks as data transformation in the Extraction, Loading, and Transformation (ELT) process for data integration. After raw data is extracted from source systems and loaded into BigQuery, Dataform helps you to transform it into a well-defined, tested, and documented suite of data tables.

If your ML pipeline includes use of the [`ML.GENERATE_TEXT` function](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-text), you can adapt the [`structured_table_ml.js` example library](https://github.com/dataform-co/dataform-bqml/blob/main/modules/structured_table_ml.js) to iterate through calls to the function. Calling the function iteratively lets you address any retryable errors that occur due to exceeding the quotas and limits that apply to the function.
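Both of the iterative approaches mentioned above (the SQL procedure and the Dataform JavaScript library) follow the same pattern: generate text for all rows, then re-run the function only for the rows whose status column reports an error. As a language-neutral illustration of that pattern, not the documented implementations themselves, here is a minimal Python sketch. It assumes the `google-cloud-bigquery` client library, an existing remote model `my_dataset.text_model`, and a prompts table with a single `prompt` column; all names are placeholders.

```python
# A minimal sketch of calling ML.GENERATE_TEXT iteratively to retry rows
# that failed with retryable (for example, quota) errors.
import time
from google.cloud import bigquery

client = bigquery.Client()

# First pass: generate text for every prompt. With a flattened output, the
# ml_generate_text_status column is empty on success and holds an error
# message on failure.
client.query("""
    CREATE OR REPLACE TABLE `my_dataset.results` AS
    SELECT *
    FROM ML.GENERATE_TEXT(
      MODEL `my_dataset.text_model`,
      TABLE `my_dataset.prompts`,
      STRUCT(TRUE AS flatten_json_output));
""").result()

# Retry passes: keep the successful rows and regenerate only the failed
# ones. The UNION ALL relies on both branches having the same schema, which
# holds when the prompts table has only the single `prompt` column.
RETRY_SQL = """
    CREATE OR REPLACE TABLE `my_dataset.results` AS
    SELECT * FROM `my_dataset.results`
    WHERE ml_generate_text_status = ''
    UNION ALL
    SELECT *
    FROM ML.GENERATE_TEXT(
      MODEL `my_dataset.text_model`,
      (SELECT prompt FROM `my_dataset.results`
       WHERE ml_generate_text_status != ''),
      STRUCT(TRUE AS flatten_json_output));
"""

for attempt in range(10):  # cap retries so persistent errors cannot loop forever
    failed = next(iter(client.query(
        "SELECT COUNT(*) AS n FROM `my_dataset.results` "
        "WHERE ml_generate_text_status != ''").result())).n
    if failed == 0:
        break
    time.sleep(60)  # back off before retrying the failed rows
    client.query(RETRY_SQL).result()
```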
[[["이해하기 쉬움","easyToUnderstand","thumb-up"],["문제가 해결됨","solvedMyProblem","thumb-up"],["기타","otherUp","thumb-up"]],[["이해하기 어려움","hardToUnderstand","thumb-down"],["잘못된 정보 또는 샘플 코드","incorrectInformationOrSampleCode","thumb-down"],["필요한 정보/샘플이 없음","missingTheInformationSamplesINeed","thumb-down"],["번역 문제","translationIssue","thumb-down"],["기타","otherDown","thumb-down"]],["최종 업데이트: 2025-09-04(UTC)"],[[["\u003cp\u003eML pipelines represent MLOps workflows, breaking them down into standardized, reusable tasks to automate and monitor processes for training and deploying models.\u003c/p\u003e\n"],["\u003cp\u003eVertex AI Pipelines allows you to create portable and extensible ML pipelines, using a directed acyclic graph (DAG) of containerized tasks with input-output dependencies.\u003c/p\u003e\n"],["\u003cp\u003eGoogleSQL queries enable the creation of SQL-based ML pipelines, including running multi-statement queries in sequence to automate tasks like creating or dropping tables, as well as implementing complex logic.\u003c/p\u003e\n"],["\u003cp\u003eDataform can be utilized to develop, test, version control, and schedule complex SQL workflows for data transformation in BigQuery, particularly useful for ML pipelines requiring version control.\u003c/p\u003e\n"],["\u003cp\u003eFor ML pipelines that involve using the \u003ccode\u003eML.GENERATE_TEXT\u003c/code\u003e function, both GoogleSQL and Dataform offer ways to handle quota errors by iteratively calling the function, enabling the ability to retry if necessary.\u003c/p\u003e\n"]]],[],null,["# ML pipelines overview\n=====================\n\nThis document provides an overview of the services you can use to build an ML\npipeline to manage your BigQuery ML\n[MLOps](/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning)\nworkflow.\n\nAn ML pipeline is a representation of an MLOps workflow that is composed of a\nseries of *pipeline tasks*. Each pipeline task performs a specific step in the\nMLOps workflow to train and deploy a model. Separating each step into a\nstandardized, reusable task lets you automate and monitor repeatable processes\nin your ML practice.\n\nYou can use any of the following services to create BigQuery ML\nML pipelines:\n\n- Use Vertex AI Pipelines to create portable, extensible ML pipelines.\n- Use GoogleSQL queries to create less complex SQL-based ML pipelines.\n- Use Dataform to create more complex SQL-based ML pipelines, or ML pipelines where you need to use version control.\n\nVertex AI Pipelines\n-------------------\n\nIn [Vertex AI Pipelines](/vertex-ai/docs/pipelines/introduction),\nan ML pipeline is structured as a directed acyclic graph (DAG) of containerized\npipeline tasks that are interconnected using input-output dependencies.\nEach [pipeline task](/vertex-ai/docs/pipelines/introduction#pipeline-task)\nis an instantiation of a\n[pipeline component](/vertex-ai/docs/pipelines/introduction#pipeline-component)\nwith specific inputs. When defining your ML pipeline, you connect multiple\npipeline tasks to form a DAG by routing the outputs of one pipeline task to the\ninputs for the next pipeline task in the ML workflow. You can also use the\noriginal inputs to the ML pipeline as the inputs for a given pipeline task.\n\nUse the\n[BigQuery ML components](/vertex-ai/docs/pipelines/bigqueryml-component)\nof the Google Cloud Pipeline Components SDK to compose ML pipelines\nin Vertex AI Pipelines. 
To get started with\nBigQuery ML components, see the following notebooks:\n\n- [Get started with BigQuery ML pipeline components](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/ml_ops/stage3/get_started_with_bqml_pipeline_components.ipynb)\n- [Train and evaluate a demand forecasting model](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/pipelines/google_cloud_pipeline_components_bqml_pipeline_demand_forecasting.ipynb)\n\nGoogleSQL queries\n-----------------\n\nYou can use\n[GoogleSQL procedural language](/bigquery/docs/reference/standard-sql/procedural-language)\nto execute multiple statements in a\n[multi-statement query](/bigquery/docs/multi-statement-queries). You can use a\nmulti-statement query to:\n\n- Run multiple statements in a sequence, with shared state.\n- Automate management tasks such as creating or dropping tables.\n- Implement complex logic using programming constructs such as `IF` and `WHILE`.\n\nAfter creating a multi-statement query, you can\n[save](/bigquery/docs/saved-queries-introduction) and\n[schedule](/bigquery/docs/scheduling-queries) the query to automate model\ntraining, inference, and monitoring.\n\nIf your ML pipeline includes use of the\n[`ML.GENERATE_TEXT` function](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-text),\nsee\n[Handle quota errors by calling `ML.GENERATE_TEXT` iteratively](/bigquery/docs/iterate-generate-text-calls) for more information on how to use SQL to\niterate through calls to the function. Calling the function\niteratively lets you address any retryable errors that occur due to exceeding\nthe [quotas and limits](/bigquery/quotas#cloud_ai_service_functions).\n\nDataform\n--------\n\nYou can use [Dataform](/dataform/docs/overview) to develop,\ntest, version control, and schedule complex SQL workflows for data\ntransformation in BigQuery. You can use Dataform for\nsuch tasks as data transformation in the Extraction, Loading, and\nTransformation (ELT) process for data integration. After raw data is extracted\nfrom source systems and loaded into BigQuery,\nDataform helps you to transform it into a well-defined, tested,\nand documented suite of data tables.\n\nIf your ML pipeline includes use of the\n[`ML.GENERATE_TEXT` function](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-text),\nyou can adapt the\n[`structured_table_ml.js` example library](https://github.com/dataform-co/dataform-bqml/blob/main/modules/structured_table_ml.js)\nto iterate through calls to the function. Calling the function\niteratively lets you address any retryable errors that occur due to exceeding\nthe quotas and limits that apply to the function."]]