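This document describes how BigQuery ML supports monitoring of machine learning (ML) models by evaluating and comparing the data a model uses. This includes comparing a model's serving data to its training data, and comparing new serving data to previously used serving data.
Understanding the data your models use is a critical aspect of ML, because this data affects model performance. Understanding any variance between your training and serving data is especially important for keeping your models accurate over time. A model performs best on serving data that is similar to its training data. When the serving data deviates from the data used to train the model, the model's performance can deteriorate, even if the model itself hasn't changed.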
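BigQuery ML provides functions to help you analyze your training and serving data for data skew and data drift.
Data skew occurs when the distribution of feature values in the training data differs significantly from that of the serving data in production. Training statistics for the model are saved during model training, so you don't need the original training data to use skew detection.
Data drift occurs when the distribution of feature data in production changes significantly over time. Drift detection is supported for consecutive spans of data, for example, between different days of serving data. This lets you get notified when the serving data is changing over time, so that you can retrain the model before the data sets diverge too much.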
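Use the following functions to monitor models in BigQuery ML:
ML.DESCRIBE_DATA: computes descriptive statistics for a set of training or serving data.
ML.VALIDATE_DATA_SKEW: computes the statistics for a set of serving data, and then compares them to the training data statistics that were computed when the BigQuery ML model was trained, in order to identify anomalous differences between the two data sets. To improve performance and lower cost, statistics are computed only for feature columns in the serving data that match feature columns in the training data.
ML.VALIDATE_DATA_DRIFT: computes and compares the statistics for two sets of serving data in order to identify anomalous differences between the two data sets.
ML.TFDV_DESCRIBE: computes fine-grained descriptive statistics for a set of training or serving data. This function provides the same behavior as the TensorFlow tfdv.generate_statistics_from_csv API.
ML.TFDV_VALIDATE: compares training and serving data statistics, or two sets of serving data statistics, in order to identify anomalous differences between the two data sets. This function provides the same behavior as the TensorFlow validate_statistics API.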
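As a minimal sketch of how a skew check with ML.VALIDATE_DATA_SKEW might look, the following query compares a serving table against the training statistics saved with a model and returns only the features flagged as anomalous. The model and table names are hypothetical placeholders. The threshold option and the output columns (input, metric, threshold, value, is_anomaly) reflect the function reference as understood here, so verify them against the reference before relying on them.

```googlesql
-- Hypothetical names: replace the model and the serving table with your own.
SELECT
  input,       -- name of the feature column that was compared
  metric,      -- distance metric used for the comparison
  threshold,   -- threshold the metric value was checked against
  value,       -- computed metric value
  is_anomaly   -- TRUE when value exceeds threshold
FROM
  ML.VALIDATE_DATA_SKEW(
    MODEL `myproject.mydataset.mymodel`,
    TABLE `myproject.mydataset.serving`,
    -- Assumed option: a custom threshold for categorical features.
    STRUCT(0.2 AS categorical_default_threshold))
WHERE is_anomaly;
```

Because the training statistics are stored with the model, the query needs only the serving table, not the original training table.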
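Monitoring use cases
This section describes how to use the BigQuery ML model monitoring functions in common monitoring use cases.
Basic data skew monitoring
This use case is appropriate when you want to quickly develop a model and monitor it for data skew, and you don't need fine-grained skew statistics to integrate with an existing monitoring solution.
Typical steps for this use case are as follows:
Run the ML.DESCRIBE_DATA function on your training and serving data to make sure that both data sets compare appropriately to each other and are within expected parameters.
Create a BigQuery ML model and train it on the training data.
Run the ML.VALIDATE_DATA_SKEW function to compare the serving data statistics with the training data statistics that were computed during model creation, in order to see whether there is any data skew.
If there is data skew, investigate the root cause, adjust the training data appropriately, and then retrain the model.
Basic data drift monitoring
This use case is appropriate when you want to quickly develop a model and monitor it for data drift, and you don't need fine-grained drift statistics to integrate with an existing monitoring solution.
Typical steps for this use case are as follows:
Run the ML.DESCRIBE_DATA function on your training and serving data to make sure that both data sets compare appropriately to each other and are within expected parameters.
Create a BigQuery ML model and train it on the training data.
Run the ML.VALIDATE_DATA_DRIFT function to compare the statistics for two different serving data sets, in order to see whether there is any data drift. For example, you might want to compare the current serving data to historical serving data from a table snapshot, or to the features served at a particular point in time, which you can get by using the ML.FEATURES_AT_TIME function.
If there is data drift, investigate the root cause, adjust the training data appropriately, and then retrain the model.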
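The drift check described above could be sketched as follows, comparing yesterday's serving data with today's. This is an illustration under stated assumptions, not a definitive query: the table name and the served_at timestamp column are hypothetical, and the threshold option and output columns follow the ML.VALIDATE_DATA_DRIFT reference as understood here.

```googlesql
-- Hypothetical table with a served_at TIMESTAMP column that records
-- when each row was served. Replace the names with your own.
SELECT *
FROM
  ML.VALIDATE_DATA_DRIFT(
    -- Baseline: rows served yesterday.
    (
      SELECT * EXCEPT (served_at)
      FROM `myproject.mydataset.serving`
      WHERE DATE(served_at) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
    ),
    -- Data under study: rows served today.
    (
      SELECT * EXCEPT (served_at)
      FROM `myproject.mydataset.serving`
      WHERE DATE(served_at) = CURRENT_DATE()
    ),
    -- Assumed option: a custom threshold for numerical features.
    STRUCT(0.2 AS numerical_default_threshold))
-- List anomalous features first.
ORDER BY is_anomaly DESC, value DESC;
```

Excluding served_at keeps the timestamp itself from being treated as a feature, since its distribution differs between any two days by construction.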
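Advanced data skew or drift monitoring
This use case is appropriate when you want fine-grained skew or drift statistics to integrate with an existing monitoring solution or to use for other purposes.
Typical steps for this use case are as follows:
Run the ML.TFDV_DESCRIBE function on your training and serving data at intervals appropriate to your monitoring solution, and save the query results. This step lets you compare future serving data to training and serving data from past points in time.
Run the ML.TFDV_VALIDATE function on your training and serving data statistics, or on two sets of serving data statistics, to evaluate data skew or feature drift, respectively. The training and serving data statistics must be provided as a TensorFlow DatasetFeatureStatisticsList protocol buffer in JSON format. You can generate a protocol buffer in the correct format by running the ML.TFDV_DESCRIBE function, or you can load it from outside of BigQuery. The following example shows how to evaluate feature skew: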
```googlesql
-- Variables that hold the statistics for each data set.
DECLARE stats1 JSON;
DECLARE stats2 JSON;

-- Compute fine-grained statistics for the training data.
SET stats1 = (
  SELECT * FROM ML.TFDV_DESCRIBE(TABLE `myproject.mydataset.training`)
);
-- Compute fine-grained statistics for the serving data.
SET stats2 = (
  SELECT * FROM ML.TFDV_DESCRIBE(TABLE `myproject.mydataset.serving`)
);

-- Compare the two sets of statistics for skew.
SELECT ML.TFDV_VALIDATE(stats1, stats2, 'SKEW');

-- Save a timestamped statistics snapshot for later comparisons.
INSERT `myproject.mydataset.serve_stats`
  (t, dataset_feature_statistics_list)
SELECT CURRENT_TIMESTAMP() AS t, stats1;
```
If there is data skew or data drift, investigate the root cause, adjust the training data appropriately, and then retrain the model.
Monitoring visualization
Preview: This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
Some monitoring functions offer integration with Vertex AI model monitoring, so that you can use charts and graphs to analyze model monitoring function output.
Using Vertex AI visualizations offers the following benefits:
Interactive visualizations: explore data distributions, skew metrics, and drift metrics by using charts and graphs in the Vertex AI console.
Historical analysis: track model monitoring results over time by using Vertex AI visualizations. This lets you identify trends and patterns in data changes so that you can proactively update and maintain models.
Centralized management: manage monitoring for all BigQuery ML and Vertex AI models in the unified Vertex AI dashboard.
You can enable visualization of the ML.VALIDATE_DATA_DRIFT function output by using that function's MODEL argument. You can enable visualization of the ML.VALIDATE_DATA_SKEW function output by using that function's enable_visualization_link argument.
You can use monitoring visualization only with models that are registered with Vertex AI. You can register an existing model by using the ALTER MODEL statement.
Monitoring automation
You can automate monitoring by using a scheduled query to run the monitoring function, evaluate the output, and retrain the model if anomalies are detected. You must enable email notifications as part of setting up the scheduled query.
For an example that shows how to automate the ML.VALIDATE_DATA_SKEW function, see "Automate skew detection" in the ML.VALIDATE_DATA_SKEW reference.
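The automation flow described above (run the monitoring function, evaluate the output, retrain on anomalies) might be sketched as the following scheduled query body. This is not the example from the function reference. All object names, the model type, the label column, and the threshold value are hypothetical placeholders. Raising an error with ERROR is one assumed way to make the run fail so that the scheduled query's email notification is sent.

```googlesql
-- Sketch of a query body to schedule. All object names and the
-- CREATE MODEL options are hypothetical placeholders.
DECLARE anomaly_count INT64;

-- Count the features whose serving distribution is flagged as skewed.
SET anomaly_count = (
  SELECT COUNTIF(is_anomaly)
  FROM
    ML.VALIDATE_DATA_SKEW(
      MODEL `myproject.mydataset.mymodel`,
      TABLE `myproject.mydataset.serving`,
      STRUCT(0.3 AS categorical_default_threshold))
);

IF anomaly_count > 0 THEN
  -- Retrain the model on refreshed training data.
  CREATE OR REPLACE MODEL `myproject.mydataset.mymodel`
    OPTIONS (model_type = 'LINEAR_REG', input_label_cols = ['label'])
  AS
  SELECT * FROM `myproject.mydataset.training_refreshed`;

  -- Fail the run on purpose so that the email notification is triggered.
  SELECT ERROR(
    FORMAT('Skew detected in %d feature(s); model retrained.', anomaly_count));
END IF;
```

Retraining before raising the error means the model is already refreshed by the time the notification arrives. If you would rather review anomalies before retraining, drop the CREATE OR REPLACE MODEL statement and keep only the error.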