Use Datastream to ingest data into partitioned tables in BigQuery
You might need to partition your BigQuery tables into smaller segments to improve query performance and control costs. Because Datastream doesn't partition BigQuery tables itself, you need to set up the partitioning manually before you start your stream. For general information about partitioning in BigQuery, see Introduction to partitioned tables.
Partition tables in BigQuery
To partition your tables in BigQuery, use one of the options described in the following sections, depending on your use case.
Option 1: The table already exists in BigQuery and is included in a stream
1. Exclude the table from the source configuration of your stream. For more information about including and excluding objects in your source configuration, see Configure source databases.
2. Wait a few minutes to make sure that Datastream has finished processing all events for the table.
3. Create your partitioned table in BigQuery. If you want to keep the data that's already in the original BigQuery table, give the new table a different, temporary name.
4. Copy the data from the original table to the new partitioned table.
5. Drop or rename the original table.
6. Rename the new table from its temporary name to the name of the original table.
7. Add the source table back to the configuration of your stream.
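The BigQuery side of this procedure (creating the partitioned table, copying the data, and swapping the names) might look roughly like the following sketch. The table names `dataset.orders`, `orders_tmp`, and `orders_old`, the column layout, and the daily partitioning granularity are all hypothetical and only serve as an illustration. Adapt them to the table that Datastream created for you, and run the statements only after the table has been excluded from the stream and its events have finished processing.

```sql
-- Hypothetical names: dataset.orders is the original table written by
-- Datastream; dataset.orders_tmp is the new partitioned table.

-- Create the partitioned table under a temporary name.
-- The columns and the daily granularity are assumptions for illustration.
CREATE TABLE dataset.orders_tmp (
  id INT64,
  name STRING,
  update_date TIMESTAMP,
  datastream_metadata STRUCT<uuid STRING, source_timestamp INT64>,
  PRIMARY KEY (id) NOT ENFORCED
)
PARTITION BY TIMESTAMP_TRUNC(update_date, DAY);

-- Copy the existing rows into the partitioned table.
INSERT INTO dataset.orders_tmp (id, name, update_date, datastream_metadata)
SELECT id, name, update_date, datastream_metadata
FROM dataset.orders;

-- Rename the original table (or drop it if you no longer need it).
ALTER TABLE dataset.orders RENAME TO orders_old;

-- Give the partitioned table the original name.
ALTER TABLE dataset.orders_tmp RENAME TO orders;
```

Renaming the original table instead of dropping it keeps a fallback copy until you have confirmed that the stream writes to the partitioned table as expected.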
Option 2: The table doesn't exist in BigQuery
1. Create the table in BigQuery using one of the following approaches:
   - Use the BigQuery Migration Toolkit.
   - Manually create a Datastream-compatible BigQuery table. For example, to create a table that is partitioned on a TIMESTAMP column, you can use a query similar to the following:

        CREATE TABLE dataset.partitioned_table (
          id INT64,
          name STRING,
          update_date TIMESTAMP,
          -- Metadata column that Datastream writes for each row.
          datastream_metadata STRUCT<uuid STRING, source_timestamp INT64>,
          PRIMARY KEY (id) NOT ENFORCED
        )
        -- Daily granularity is an example; choose what fits your queries.
        PARTITION BY TIMESTAMP_TRUNC(update_date, DAY)

2. After you create the partitioned table, make sure that its max_staleness value is set according to your requirements. If you don't set a value, the default of 0 applies. Leaving the value at 0 gives you the freshest data, but incurs a significant cost. For information about how to find the optimal value for your table, see Use BigQuery tables with the max_staleness option.
3. Add the source table to the configuration of your stream.
4. Optionally, if you've set manual backfill for the stream, initiate backfill for the table.

Last updated 2025-09-04 UTC.
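For the max_staleness setting described under Option 2, one way to set the value on an existing table is an `ALTER TABLE ... SET OPTIONS` statement. The following is a minimal sketch: the table name `dataset.partitioned_table` is hypothetical, and the 15-minute interval is an arbitrary example value, not a recommendation.

```sql
-- Hypothetical table name; 15 minutes is only an example value.
-- A larger interval lets queries tolerate older data in exchange for
-- lower cost; 0 (the default) always returns the freshest data.
ALTER TABLE dataset.partitioned_table
SET OPTIONS (max_staleness = INTERVAL 15 MINUTE);
```

Choose the interval based on how fresh your queries need the data to be.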