Check that the key system.profile.properties.imageVersion hasn't been overridden with an incorrect image version as its value.

Click Save.
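To verify this without opening the Studio UI, you can read the pipeline-level preferences, which act as default runtime arguments, through the CDAP REST API. This is a minimal sketch, assuming CDAP_ENDPOINT points at your instance's CDAP API endpoint, AUTH_TOKEN holds a valid access token, and NAMESPACE_ID and PIPELINE_NAME are placeholders for your own values:

```bash
# List the preferences stored for one pipeline and look for an
# image version override. No output means no pipeline-level override.
curl -s -H "Authorization: Bearer ${AUTH_TOKEN}" \
  "${CDAP_ENDPOINT}/v3/namespaces/NAMESPACE_ID/apps/PIPELINE_NAME/preferences" \
  | grep "system.profile.properties.imageVersion"
```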
Recreate static Dataproc clusters used by Cloud Data Fusion with the chosen image version

If you use existing Dataproc clusters with Cloud Data Fusion, follow the Dataproc guide to recreate the clusters with the Dataproc image version chosen for your Cloud Data Fusion version. Keep the cluster name the same.

Alternatively, you can create a new Dataproc cluster with the chosen Dataproc image version, then delete and recreate the compute profile in Cloud Data Fusion with the same compute profile name and the updated Dataproc cluster name. This way, batch pipelines that are already running can complete execution on the existing cluster, and subsequent pipeline runs take place on the new Dataproc cluster. After you confirm that all pipeline runs have completed, you can delete the old Dataproc cluster.
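With either approach, you can create the cluster with the gcloud CLI and pin the image version explicitly. A minimal sketch, where CLUSTER_NAME and REGION are placeholders and 2.1-debian11 stands in for your chosen image version:

```bash
# Create the replacement cluster pinned to the chosen image version.
# CLUSTER_NAME, REGION, and the image version are placeholders to adapt.
gcloud dataproc clusters create CLUSTER_NAME \
  --region=REGION \
  --image-version=2.1-debian11
```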
Check that the Dataproc image version is updated

To verify the change through the Dataproc REST API, list the clusters in your project and region, then search for the name of the cluster that your pipeline uses. Under that JSON object, check the image in config > softwareConfig > imageVersion.
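One way to script this check (a sketch, assuming the gcloud CLI and jq are installed, and with PROJECT_ID and REGION_ID as placeholders):

```bash
# Print each cluster's name with its image version.
# PROJECT_ID and REGION_ID are placeholders for your project and region.
curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://dataproc.googleapis.com/v1/projects/PROJECT_ID/regions/REGION_ID/clusters" \
  | jq -r '.clusters[] | "\(.clusterName)\t\(.config.softwareConfig.imageVersion)"'
```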
Change the Dataproc image to version 2.2 or 2.1

Cloud Data Fusion versions 6.9.1 and later support the Dataproc image 2.1 for Compute Engine, which runs on Java 11. In versions 6.10.0 and later, image 2.1 is the default.

If you change from an earlier image to image 2.2 or 2.1, the JDBC drivers that the database plugins use in those instances must be compatible with Java 11 for your batch pipelines and replication jobs to succeed.

Dataproc images 2.2 and 2.1 have the following limitations in Cloud Data Fusion:

- MapReduce jobs aren't supported.
- JDBC driver versions used by the database plugins in your instance must be updated to support Java 11. The following table lists driver versions that work with Dataproc 2.2, 2.1, and Java 11; a quick local compatibility check is sketched after the table.
| JDBC driver | Earlier version, removed in Cloud Data Fusion 6.9.1 | Version with Java 8 and Java 11 support that works with Dataproc 2.2, 2.1, or 2.0 |
|---|---|---|
| Cloud SQL JDBC driver for MySQL | - | 1.0.16 |
| Cloud SQL JDBC driver for PostgreSQL | - | 1.0.16 |
| Microsoft SQL Server JDBC driver | Microsoft JDBC Driver 6.0 | Microsoft JDBC Driver 9.4 |
| MySQL JDBC driver | 5.0.8, 5.1.39 | 8.0.25 |
| PostgreSQL JDBC driver | 9.4.1211.jre7, 9.4.1211.jre8 | 42.6.0.jre8 |
| Oracle JDBC driver | ojdbc7 | ojdbc8 (12c and later) |
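Upgrading to the versions in the right-hand column is the fix, but if you're unsure whether some other driver JAR is safe on Java 11, one rough local check (a sketch, assuming a JDK with the jdeps tool on your PATH and a hypothetical JAR file name) is to scan it for JDK-internal API usage:

```bash
# Report any dependencies on JDK-internal APIs, a common reason
# older driver JARs break when moving from Java 8 to Java 11.
jdeps --jdk-internals mysql-connector-java-8.0.25.jar
```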
Memory usage when using Dataproc 2.2 or 2.1

Memory usage might increase for pipelines that use Dataproc 2.2 or 2.1 clusters. If you upgrade your instance to version 6.10 or later and earlier pipelines fail due to memory issues, increase the driver and executor memory to 2048 MB in the pipeline's Resources configuration.
Alternatively, you can override the Dataproc version by setting the system.profile.properties.imageVersion runtime argument to 2.0-debian10.
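If you want to apply this override without the web interface, one option is to store it as a pipeline-level preference, which CDAP uses as a default runtime argument. A sketch under the same assumptions as the earlier REST examples (CDAP_ENDPOINT, AUTH_TOKEN, and placeholder NAMESPACE_ID and PIPELINE_NAME values):

```bash
# Store the image version override for one pipeline.
curl -s -X PUT \
  -H "Authorization: Bearer ${AUTH_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"system.profile.properties.imageVersion": "2.0-debian10"}' \
  "${CDAP_ENDPOINT}/v3/namespaces/NAMESPACE_ID/apps/PIPELINE_NAME/preferences"
```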
[[["이해하기 쉬움","easyToUnderstand","thumb-up"],["문제가 해결됨","solvedMyProblem","thumb-up"],["기타","otherUp","thumb-up"]],[["이해하기 어려움","hardToUnderstand","thumb-down"],["잘못된 정보 또는 샘플 코드","incorrectInformationOrSampleCode","thumb-down"],["필요한 정보/샘플이 없음","missingTheInformationSamplesINeed","thumb-down"],["번역 문제","translationIssue","thumb-down"],["기타","otherDown","thumb-down"]],["최종 업데이트: 2025-09-08(UTC)"],[[["\u003cp\u003eYou can modify the Dataproc image version used by your Cloud Data Fusion instance at the instance, namespace, or pipeline level.\u003c/p\u003e\n"],["\u003cp\u003eBefore changing the Dataproc image version, it is crucial to stop all real-time pipelines and replication jobs to ensure the changes are applied correctly.\u003c/p\u003e\n"],["\u003cp\u003eThe Dataproc image version can be set through the Cloud Data Fusion web interface, in Compute Configurations, Namespace Preferences, or Pipeline Runtime Arguments.\u003c/p\u003e\n"],["\u003cp\u003eTo ensure batch pipelines and replication jobs succeed with Dataproc image 2.2 or 2.1, verify that the JDBC drivers used by the database plugins are compatible with Java 11.\u003c/p\u003e\n"],["\u003cp\u003eIf you use existing Dataproc clusters with Cloud Data Fusion, you should recreate them with the desired image version, ensuring the cluster name remains consistent for seamless operation.\u003c/p\u003e\n"]]],[],null,["# Change the Dataproc image version in Cloud Data Fusion\n\nThis page describes how to change the Dataproc image version used\nby your Cloud Data Fusion instance. You can change the image at the\ninstance, namespace, or pipeline level.\n\nBefore you begin\n----------------\n\nStop all real-time pipelines and replication jobs in the\nCloud Data Fusion instance. If a real-time pipeline or replication is\nrunning when you change the Dataproc image version, the changes\naren't applied to the pipeline execution.\n\nFor real-time pipelines, if checkpointing is enabled, stopping the\npipelines doesn't cause any data loss. For replication jobs, as long\nas the database logs are available, stopping and starting the\nreplication job doesn't cause data loss.\n**Note:** After the configuration changes are applied, **Batch pipelines** use the following updated configurations on subsequent runs. \n\n### Console\n\n1. Go to the Cloud Data Fusion **Instances** page and open the\n instance where you need to stop a pipeline.\n\n [Go to Instances](https://console.cloud.google.com/data-fusion/locations/-/instances)\n2. Open each real-time pipeline in the Pipeline Studio and click\n **Stop**.\n\n3. 
Open each replication job on the **Replicate** page and\n click **Stop**.\n\n### REST API\n\n- To retrieve all pipelines, use the following REST API call:\n\n GET -H \"Authorization: Bearer ${AUTH_TOKEN}\" \\\n \"${CDAP_ENDPOINT}/v3/namespaces/\u003cvar translate=\"no\"\u003eNAMESPACE_ID\u003c/var\u003e/apps\"\n\n Replace \u003cvar translate=\"no\"\u003eNAMESPACE_ID\u003c/var\u003e with the name of your\n namespace.\n- To stop a real-time pipeline, use the following REST API call:\n\n POST -H \"Authorization: Bearer ${AUTH_TOKEN}\" \\\n \"${CDAP_ENDPOINT}/v3/namespaces/\u003cvar translate=\"no\"\u003eNAMESPACE_ID\u003c/var\u003e/apps/\u003cvar translate=\"no\"\u003ePIPELINE_NAME\u003c/var\u003e/spark/DataStreamsSparkStreaming/stop\"\n\n Replace \u003cvar translate=\"no\"\u003eNAMESPACE_ID\u003c/var\u003e with the name of your\n namespace and \u003cvar translate=\"no\"\u003ePIPELINE_NAME\u003c/var\u003e with the name of the\n real-time pipeline.\n- To stop a replication job, use the following REST API call:\n\n POST -H \"Authorization: Bearer ${AUTH_TOKEN}\" \\\n \"${CDAP_ENDPOINT}/v3/namespaces/\u003cvar translate=\"no\"\u003eNAMESPACE_ID\u003c/var\u003e/apps/\u003cvar translate=\"no\"\u003eREPLICATION_JOB_NAME\u003c/var\u003e/workers/DeltaWorker/stop\"\n\n Replace \u003cvar translate=\"no\"\u003eNAMESPACE_ID\u003c/var\u003e with the name of your\n Namespace and \u003cvar translate=\"no\"\u003eREPLICATION_JOB_NAME\u003c/var\u003e with the name\n of the replication job.\n\n For more information, see [stopping real-time pipelines](/data-fusion/docs/reference/cdap-reference#stop_a_real-time_pipeline)\n and [stopping replication jobs](/data-fusion/docs/reference/replication-ref#stop-a-replication-job).\n\nCheck and override the default version of Dataproc in Cloud Data Fusion\n-----------------------------------------------------------------------\n\n1. [Go to the Cloud Data Fusion web interface](/data-fusion/docs/create-data-pipeline#navigate_the_web_interface).\n\n2. Click **System Admin \\\u003e Configuration \\\u003e System\n Preferences**.\n\n3. If a Dataproc image is not specified in System Preferences,\n or to change the preference, click **Edit System Preferences**.\n\n 1. Enter the following text in the **Key** field:\n\n `system.profile.properties.imageVersion`\n 2. Enter the chosen Dataproc image in the **Value field** ,\n such as `2.1`.\n\n 3. Click **Save \\& Close**.\n\nThis change affects the entire Cloud Data Fusion instance, including all\nits Namespaces and pipeline runs, unless the image version property is\noverridden in a Namespace, pipeline, or Runtime Argument in your instance.\n\nChange the Dataproc image version\n---------------------------------\n\nThe image version can be set in the Cloud Data Fusion web interface in\nthe Compute Configurations, Namespace Preferences, or Pipeline Runtime\nArguments.\n| **Note:** If you haven't overridden the Dataproc image version in Namespace Preferences or Pipeline Runtime Arguments, skip these steps.\n\n### Change the image in Namespace Preferences\n\nIf you have overridden the image version in your Namespace properties,\nfollow these steps:\n\n1. [Go to the Cloud Data Fusion web interface](/data-fusion/docs/create-data-pipeline#navigate_the_web_interface).\n\n2. Click **System Admin \\\u003e Configuration \\\u003e Namespaces**.\n\n3. Open each namespace and click **Preferences**.\n\n 1. Make sure that there is no override with key\n `system.profile.properties.imageVersion` with an incorrect image\n version value.\n\n 2. 
Click **Finish**.\n\n### Change the image in System Compute Profiles\n\n1. [Go to the Cloud Data Fusion web interface](/data-fusion/docs/create-data-pipeline#navigate_the_web_interface).\n\n2. Click **System Admin \\\u003e Configuration**.\n\n3. Click System **Compute Profiles \\\u003e Create New Profile**.\n\n4. Select the **Dataproc** provisioner.\n\n5. Create the profile for Dataproc. In the **Image Version**\n field, enter a Dataproc image version.\n\n6. Select this compute profile while running the pipeline on the **Studio**\n page. On the pipeline run page, click **Configure \\\u003e Compute\n config** and select this profile.\n\n7. Select the Dataproc profile and click **Save**.\n\n | **Note:** For more information about using image 2.2 and 2.1, which run in Java11, see [Change the Dataproc image to version 2.2 or 2.1](#change-to-dataproc-21).\n8. Click **Finish**.\n\n### Change the image in Pipeline Runtime Arguments\n\nIf you have overridden the image version with a property in the Runtime\nArguments of your pipeline, follow these steps:\n\n1. [Go to the Cloud Data Fusion web interface](/data-fusion/docs/create-data-pipeline#navigate_the_web_interface).\n\n2. Click menu\n **Menu \\\u003e List**.\n\n3. On the **List** page, select the pipeline you want to update.\n\n The pipeline opens on the **Studio** page.\n4. To expand the **Run** options, click the arrow_drop_down expander arrow.\n\n The **Runtime Arguments** window opens.\n5. Check that there is no override with the key\n `system.profile.properties.imageVersion` with an incorrect image version\n as the value.\n\n6. Click **Save**.\n\nRecreate static Dataproc clusters used by Cloud Data Fusion with chosen image version\n-------------------------------------------------------------------------------------\n\nIf you use existing Dataproc clusters with\nCloud Data Fusion, follow the [Dataproc\nguide](/dataproc/docs/guides/recreate-cluster) to recreate the clusters with the\nchosen Dataproc image version for your Cloud Data Fusion\nversion.\n| **Important:** Keep the cluster name the same.\n| **Note:** If there are any pipelines running when the cluster is being recreated, the pipelines will fail. Subsequent runs should run on the recreated cluster.\n\nAlternatively, you can create a new Dataproc cluster with the\nchosen Dataproc image version and delete and recreate the compute\nprofile in Cloud Data Fusion with the same compute profile name and\nupdated Dataproc cluster name. This way, running batch pipelines\ncan complete execution on the existing cluster and subsequent pipeline runs take\nplace on the new Dataproc cluster. You can delete the old\nDataproc cluster after you have confirmed that all pipeline runs\nhave completed.\n\nCheck that the Dataproc image version is updated\n------------------------------------------------\n\n### Console\n\n1. In the Google Cloud console, go to the Dataproc **Clusters**\n page.\n\n [Go to Clusters](https://console.cloud.google.com/dataproc/clusters)\n2. Open the **Cluster details** page for the new cluster that\n Cloud Data Fusion created when you specified the new version.\n\n The **Image version** field has the new value that you specified in\n Cloud Data Fusion.\n\n### REST API\n\n1. 
Get the list of clusters with their metadata:\n\n GET -H \"Authorization: Bearer ${AUTH_TOKEN}\" \\\n https://dataproc.googleapis.com/v1/projects/\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e/regions/\u003cvar translate=\"no\"\u003eREGION_ID\u003c/var\u003e/clusters\n\n Replace the following:\n - \u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e with the name of your namespace\n - \u003cvar translate=\"no\"\u003eREGION_ID\u003c/var\u003e with the name of the region where your clusters are located\n2. Search for the name of your pipeline (cluster name).\n\n3. Under that JSON object, see the image in `config `\u003e`\n softwareConfig `\u003e` imageVersion`.\n\nChange the Dataproc image to version 2.2 or 2.1\n-----------------------------------------------\n\nCloud Data Fusion versions 6.9.1 and later support the\nDataproc image 2.1 Compute Engine, which runs in Java 11.\nIn versions 6.10.0 and later, image 2.1 is the default.\n\nIf you change to image 2.2 or 2.1 from an earlier image, for your batch\npipelines and replication jobs to succeed, the JDBC drivers that the\ndatabase plugins use in those instances must be compatible with Java 11.\n\nDataproc image 2.2 and 2.1 have the following limitations in\nCloud Data Fusion:\n\n- Map reduce jobs aren't supported.\n- JDBC driver versions used in the database plugins in your instance must be updated to have support for Java 11. See the following table for driver versions that work with Dataproc 2.2, 2.1, and Java 11:\n\n### Memory usage when using Dataproc 2.2 or 2.1\n\nMemory usage might increase for pipelines that use Dataproc 2.2\nor 2.1 clusters. If you upgrade your instance to version 6.10 or later, and\nprevious pipelines are failing due to memory issues, increase the driver and\nexecutor memory to 2048 MB in the `Resources` configuration for the\npipeline.\n\nAlternatively, you can override the Dataproc version by setting\nthe `system.profile.properties.imageVersion` runtime argument to `2.0-debian10`."]]