Last updated: 2025-09-04 (UTC)

# Migrate a CDC table to another region

This page describes best practices for a use case where you've set up Datastream replication to BigQuery but configured the destination dataset in an incorrect region. You then want to move the dataset to another region (or multi-region) without having to re-synchronize all of the data from the source database to BigQuery.

**Note:** Querying the secondary region during the migration procedure might return incorrect or incomplete results.
For more information about the limitations related to the migration procedure described on this page, see [Cross-region dataset replication](/bigquery/docs/data-replication#limitations).

## Before you begin

Before you start migrating your data to another region, consider the following:

- Migration takes time, and you must temporarily pause the stream during the operation. To maintain data integrity, the source database must retain its change logs while the stream is paused. To estimate how long to pause the stream, combine the dataset's `max_staleness` value with the duration of the longest-running merge operation:
  - For information about how long merge operations might take to finish, see [Recommended table `max_staleness` value](/bigquery/docs/change-data-capture#recommended-max-staleness).
  - To find the maximum `max_staleness` value in the dataset, see [Determine the current `max_staleness` value of a table](/bigquery/docs/change-data-capture#determine-max-staleness) and adjust the query to your specific needs.
  - If the estimated pause is too long for your source database to support, consider temporarily reducing the `max_staleness` value for the tables in the dataset.
- Verify that you have sufficient BigQuery resources in the destination region (query reservation and background reservation). For more information about reservations, see [Reservation assignments](/bigquery/docs/reservations-intro#assignments).
- Verify that the user performing the migration has sufficient permissions to perform this operation, including any required [Identity and Access Management (IAM)](/iam) roles and [VPC Service Controls](/security/vpc-service-controls) access.

## Migration steps

To initiate [dataset migration](/bigquery/docs/data-replication#migrate_datasets), use BigQuery data replication:
1. In the Google Cloud console, go to the **BigQuery Studio** page.

   [Go to BigQuery Studio](https://console.cloud.google.com/bigquery)

2. Create a BigQuery dataset replica in the new region:

        ALTER SCHEMA DATASET_NAME
        ADD REPLICA 'NEW_REGION'
        OPTIONS(location='NEW_REGION');

   Replace the following:
   - `DATASET_NAME`: the name of the dataset that you want to replicate.
   - `NEW_REGION`: the name of the region where you want to create your dataset. For example, `region-us`.

3. Monitor the migration progress, and wait until the copy watermark in the replica is within a few minutes of the primary. You can run this query on the [BigQuery INFORMATION_SCHEMA](/bigquery/docs/information-schema-schemata-replicas#schema) to check the migration progress:

        SELECT
          catalog_name AS project_id,
          schema_name AS dataset_name,
          replication_time AS dataset_replica_staleness
        FROM
          `NEW_REGION`.INFORMATION_SCHEMA.SCHEMATA_REPLICAS
        WHERE
          catalog_name = 'PROJECT_ID'
          AND schema_name = 'DATASET_NAME'
          AND location = 'NEW_REGION';

   Replace the following:
   - `PROJECT_ID`: the ID of your Google Cloud project.
   - `DATASET_NAME`: the name of your dataset.
   - `DATASET_REPLICA_STALENESS`: the staleness configuration of the tables in the dataset replica that you created.
   - `NEW_REGION`: the region where you created your dataset.

4. Pause the existing Datastream stream. For more information, see [Pause the stream](/datastream/docs/run-a-stream#pauseastream).

5. Wait for the stream to drain, and take note of the time when the stream entered the `PAUSED` state.

6. Confirm that the latest CDC changes have been applied to the BigQuery table by checking the [`upsert_stream_apply_watermark`](/bigquery/docs/change-data-capture#monitor_table_upsert_operation_progress) for the table. Run the following query and ensure that the watermark timestamp is at least 10 minutes later than the time when the stream was paused:

        SELECT table_name, upsert_stream_apply_watermark
        FROM DATASET_NAME.INFORMATION_SCHEMA.TABLES

   To run the query only for a specific table, add the following `WHERE` clause:

        WHERE table_name = 'TABLE_NAME'

   Replace the following:
   - `DATASET_NAME`: the name of your dataset.
   - `TABLE_NAME`: optional. The table for which you want to check the `upsert_stream_apply_watermark`.

7. Use the query from step 3 to verify that the new region copy watermark is later than the `upsert_stream_apply_watermark` captured in step 6.

8. Optionally, manually compare several tables in the primary dataset in the original region with the replica in the new region to verify that all data is correctly copied.
9. Promote the BigQuery dataset replica by running the following command in BigQuery Studio:

        ALTER SCHEMA DATASET_NAME
        SET OPTIONS(primary_replica = 'NEW_REGION');

   Replace the following:
   - `DATASET_NAME`: the name of your dataset.
   - `NEW_REGION`: the region where you created your dataset.

10. Optionally, if you no longer need the original dataset (now the replica) and don't want to incur extra charges, go to BigQuery Studio and drop the original BigQuery dataset:

        ALTER SCHEMA DATASET_NAME DROP REPLICA IF EXISTS ORIGINAL_REGION;

    Replace the following:
    - `DATASET_NAME`: the name of the original dataset.
    - `ORIGINAL_REGION`: the region of the original dataset.

11. Create a new stream with the exact same configuration, but with the new BigQuery destination location.

12. Start the new stream.

    To prevent replicating duplicate events, start the stream from a specific position:
    - For MySQL and Oracle sources: you can identify the log position by examining the logs of the original stream and finding the last position from which the stream read successfully. For information about starting the stream from a specific position, see [Manage streams](/datastream/docs/manage-streams#startastreamfromspecific).
    - For PostgreSQL sources: the new stream starts reading changes from the first log sequence number (LSN) in the replication slot.
      Because the original stream might have already processed some of these changes, manually advance the replication slot's pointer to the last LSN from which Datastream read. You can find this LSN in the Datastream consumer logs.

13. Optionally, delete the original stream.
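The pause-duration estimate described in the Before you begin section amounts to simple timestamp arithmetic. The following is a minimal sketch, assuming you have already looked up the dataset's `max_staleness` and the longest merge duration with the queries linked above; the 10-minute safety margin is an assumption of this sketch, not a documented requirement.

```python
from datetime import timedelta

def estimate_pause_duration(max_staleness: timedelta,
                            longest_merge: timedelta,
                            safety_margin: timedelta = timedelta(minutes=10)) -> timedelta:
    """Rough lower bound on how long the stream must stay paused.

    Combines the dataset's max_staleness with the longest-running merge
    operation, plus an optional safety margin (an assumption, not from
    the official guidance).
    """
    return max_staleness + longest_merge + safety_margin

# Example: 30 min staleness + 15 min merge -> plan for at least a 55 min pause.
pause = estimate_pause_duration(timedelta(minutes=30), timedelta(minutes=15))
```

If the result exceeds what your source database's change-log retention can tolerate, that is the signal to temporarily lower `max_staleness` before starting.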
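The check in step 6 — that every table's `upsert_stream_apply_watermark` has advanced at least 10 minutes past the pause time — can be sketched in Python once the watermarks have been fetched. The dictionary shape used here is an assumption for illustration, not the actual query-result format.

```python
from datetime import datetime, timedelta

def all_changes_applied(watermarks, paused_at, margin=timedelta(minutes=10)):
    """Return True if every table's upsert_stream_apply_watermark is at
    least `margin` later than the time the stream entered PAUSED.

    `watermarks` maps table name -> watermark timestamp (assumed to be
    fetched from INFORMATION_SCHEMA.TABLES as in step 6).
    """
    threshold = paused_at + margin
    return all(wm >= threshold for wm in watermarks.values())

paused = datetime(2025, 9, 4, 12, 0)
wm = {"orders": datetime(2025, 9, 4, 12, 12),
      "users": datetime(2025, 9, 4, 12, 15)}
```

Only proceed to promoting the replica once this condition holds for all replicated tables.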
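For the PostgreSQL case in step 12, advancing the replication slot to the last LSN that Datastream read can be done with PostgreSQL's built-in `pg_replication_slot_advance()` function (available in PostgreSQL 11 and later). The sketch below runs that call through a generic DB-API cursor; the slot name and LSN values you pass in are placeholders to be taken from your own configuration and the Datastream consumer logs.

```python
def advance_replication_slot(cursor, slot_name: str, last_read_lsn: str) -> None:
    """Advance a logical replication slot to the given LSN so the new
    stream doesn't re-read changes the original stream already processed.

    `cursor` is any DB-API cursor (for example, from psycopg2) connected
    to the source PostgreSQL database with sufficient privileges.
    """
    # pg_replication_slot_advance() moves the slot's confirmed position
    # forward without emitting the skipped changes (PostgreSQL 11+).
    cursor.execute(
        "SELECT pg_replication_slot_advance(%s, %s);",
        (slot_name, last_read_lsn),
    )
```

Double-check the LSN against the Datastream consumer logs before advancing: once the slot moves forward, the skipped changes can no longer be replayed from that slot.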