Add tables to a replication job
After you deploy a replication job, you cannot edit it or add tables to it. Instead, add the tables to a new or duplicate replication job.
Option 1: Create a new replication job
Adding tables to a new job is the simplest approach. It prevents historical reloading of all the tables and avoids data inconsistency issues.
The drawbacks are the increased overhead of managing multiple replication jobs and the consumption of more compute resources, because each job runs on a separate ephemeral Dataproc cluster by default. The latter can be mitigated to some extent by using a shared static Dataproc cluster for both jobs.
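The shared static cluster mitigation can be applied by pinning each job to a user compute profile through the CDAP REST API that Cloud Data Fusion exposes. The following is a minimal sketch, not a definitive procedure: the endpoint base URL, namespace, job names, and profile name are placeholders, and the snippet only builds the preferences request rather than sending it.

```python
import json

# Placeholder values -- substitute your instance's CDAP API endpoint,
# namespace, replication job (app) names, and compute profile name.
CDAP_ENDPOINT = "https://example-instance.datafusion.googleusercontent.com/api"
NAMESPACE = "default"
PROFILE = "shared-static-cluster"  # user profile pointing at a static Dataproc cluster

def profile_preference_request(app_id, profile):
    """Build the CDAP preferences call that pins a job to a compute profile."""
    url = f"{CDAP_ENDPOINT}/v3/namespaces/{NAMESPACE}/apps/{app_id}/preferences"
    # The "USER:" prefix selects a user-created compute profile.
    body = json.dumps({"system.profile.name": f"USER:{profile}"})
    return "PUT", url, body

# Point both replication jobs at the same static cluster profile.
for job in ("orders-replication", "orders-replication-v2"):
    method, url, body = profile_preference_request(job, PROFILE)
    print(method, url, body)  # send with an authenticated HTTP client
```

In practice, you would issue the `PUT` with a client that attaches your instance's authentication token; the job names here are hypothetical examples.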
For more information about creating new jobs, see the Replication tutorials.

For more information about using a static Dataproc cluster in Cloud Data Fusion, see Run a pipeline against an existing Dataproc cluster.
Option 2: Stop the current replication job and create a duplicate
If you duplicate the replication job to add the tables, consider the following:
- Enabling the snapshot for the duplicate job results in a historical load of all the tables from scratch. This is recommended if you cannot use the previous option, where you run separate jobs.
- Disabling the snapshot to prevent the historical load can result in data loss, because events can be missed between when the old pipeline stops and when the new one starts. Creating an overlap to mitigate this issue isn't recommended, because it can also result in data loss: historical data for the new tables isn't replicated.
To create a duplicate replication job, follow these steps:
1. Stop the existing pipeline.
2. On the Replication jobs page, find the job that you want to duplicate, click more_vert, and then click Duplicate.
3. Enable the snapshot:
   1. Go to Configure source.
   2. In the Replicate existing data field, select Yes.
4. Add tables in the Select tables and transformations window, and then follow the wizard to deploy the replication pipeline.
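Step 1 can also be performed programmatically through the CDAP lifecycle API that Cloud Data Fusion exposes. This is a sketch under stated assumptions: the endpoint is a placeholder, and the replication job is assumed to run as a worker program named DeltaWorker, which you should verify for your instance. The snippet only constructs the stop request:

```python
# Placeholder endpoint -- substitute your instance's CDAP API URL.
CDAP_ENDPOINT = "https://example-instance.datafusion.googleusercontent.com/api"

def stop_request(namespace, app_id):
    """Build the CDAP lifecycle call that stops a running program.

    The "workers/DeltaWorker" path segment is an assumption about how
    replication jobs are modeled; confirm the program type and name
    on your instance before using it.
    """
    url = (f"{CDAP_ENDPOINT}/v3/namespaces/{namespace}"
           f"/apps/{app_id}/workers/DeltaWorker/stop")
    return "POST", url

method, url = stop_request("default", "orders-replication")
print(method, url)  # send with an authenticated HTTP client
```

The job name `orders-replication` is a hypothetical example; use the name of your deployed replication job.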
Note: If you run a duplicate replication job against the same target BigQuery dataset as the original job, don't run the original job again, because it can cause data inconsistency.

What's next

- Learn more about Replication.

Last updated 2025-09-04 UTC.