Troubleshoot deleting clusters

This page shows you how to resolve issues with deleting ephemeral Dataproc clusters in Cloud Data Fusion.
When Cloud Data Fusion creates an ephemeral Dataproc cluster during pipeline run provisioning, the cluster is deleted after the pipeline run finishes. In rare cases, the cluster deletion fails.
Strongly recommended: Upgrade to the most recent Cloud Data Fusion version to ensure proper cluster maintenance.
Set Max Idle Time
To resolve this issue, configure the Max Idle Time value. This lets Dataproc delete clusters automatically, even if the explicit deletion call at pipeline completion fails.
Max Idle Time is available in Cloud Data Fusion versions 6.4 and later.
In Cloud Data Fusion 6.6 and later, Max Idle Time is set to 4 hours by default.
To override the default time in the default compute profile, follow these steps:

1. Open the instance in the Cloud Data Fusion web interface.
2. Click System Admin > Configuration > System Preferences.
3. Click Edit System Preferences and add the key system.profile.properties.idleTTL and the value, in IntegerUnit format, such as 30m.
Recommended: For versions before 6.6, set Max Idle Time manually to 30 minutes or greater.
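An idleTTL value is an integer followed by a unit suffix. As a minimal sketch of how such an IntegerUnit string can be interpreted (the set of accepted unit suffixes here is an assumption for illustration; check the CDAP documentation for the exact set):

```python
# Convert an IntegerUnit string such as "30m" into seconds.
# The supported suffixes (s, m, h, d) are an assumption for
# illustration, not a documented list.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_integer_unit(value: str) -> int:
    number, unit = value[:-1], value[-1]
    if unit not in UNIT_SECONDS or not number.isdigit():
        raise ValueError(f"invalid IntegerUnit value: {value!r}")
    return int(number) * UNIT_SECONDS[unit]

print(parse_integer_unit("30m"))  # 1800 seconds
```

For example, a value of 30m keeps an idle cluster for at most 30 minutes before Dataproc deletes it.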
Delete clusters manually
If you can't upgrade your version or configure the Max Idle Time option, delete stale clusters manually instead:
1. Get each project ID where the clusters were created:

   1. In the pipeline's runtime arguments, check whether the Dataproc project ID is customized for the run.
   2. If a Dataproc project ID is not specified explicitly, determine which provisioner is used, and then check for a project ID:

      1. In the pipeline's runtime arguments, check the system.profile.name value.
      2. Open the provisioner settings and check whether the Dataproc project ID is set. If the setting is not present or the field is empty, the project that the Cloud Data Fusion instance runs in is used.

   Important: Multiple pipeline runs might use different projects. Be sure to get all of the project IDs.
2. For each project:

   1. Open the project in the Google Cloud console and go to the Dataproc Clusters page (https://console.cloud.google.com/dataproc/clusters).
   2. Sort the clusters by the date that they were created, from oldest to newest.
   3. If the info panel is hidden, click Show info panel and go to the Labels tab.
   4. For every cluster that is not in use (for example, more than a day has elapsed), check whether it has a Cloud Data Fusion version label. That label indicates the cluster was created by Cloud Data Fusion.
   5. Select the checkbox by the cluster name, and then click Delete.
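The selection logic above can be sketched as follows, assuming cluster metadata has already been listed (for example, with the Dataproc API). The label key `goog-datafusion-version`, the dictionary shapes, and the one-day threshold are assumptions for illustration:

```python
from datetime import datetime, timedelta, timezone

# Pick stale clusters that carry a Cloud Data Fusion version label.
# The label key below is an assumption for illustration; check the
# Labels tab in the console for the key your clusters actually carry.
VERSION_LABEL = "goog-datafusion-version"

def stale_datafusion_clusters(clusters, now, max_age=timedelta(days=1)):
    stale = []
    # Oldest first, mirroring the sort order in the console steps.
    for cluster in sorted(clusters, key=lambda c: c["create_time"]):
        too_old = now - cluster["create_time"] > max_age
        if too_old and VERSION_LABEL in cluster.get("labels", {}):
            stale.append(cluster["name"])
    return stale

now = datetime(2025, 9, 4, tzinfo=timezone.utc)
clusters = [
    {"name": "cdap-old", "create_time": now - timedelta(days=3),
     "labels": {VERSION_LABEL: "6_10"}},
    {"name": "fresh", "create_time": now - timedelta(hours=2),
     "labels": {VERSION_LABEL: "6_10"}},
    {"name": "unrelated", "create_time": now - timedelta(days=5),
     "labels": {}},
]
print(stale_datafusion_clusters(clusters, now))  # ['cdap-old']
```

Clusters without the version label are left alone, since they may belong to workloads other than Cloud Data Fusion.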
Skip cluster deletion
For debugging purposes, you can stop the automatic deletion of an ephemeral cluster.

To stop the deletion, set the Skip Cluster Deletion property to True. You must manually delete the cluster after you finish debugging.
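As a sketch, this kind of property can be supplied through the pipeline's runtime arguments. The key shown here follows the same system.profile.properties.* pattern as the idleTTL preference above, but treat it as an assumption and confirm the exact key for your Cloud Data Fusion version:

```python
# Runtime arguments for a debug run. The skipDelete key is an
# assumption based on the system.profile.properties.* pattern;
# confirm it against your Cloud Data Fusion version's docs.
runtime_args = {
    "system.profile.properties.skipDelete": "true",
}

print(runtime_args["system.profile.properties.skipDelete"])  # true
```

Remember to remove the argument (and delete the leftover cluster) once debugging is done, or idle clusters will keep accruing charges.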
Last updated 2025-09-04 UTC.