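The Serverless for Apache Spark (Dataproc Serverless) components let you run Apache Spark batch workloads from a pipeline within Vertex AI Pipelines.
Serverless for Apache Spark runs the batch workloads on managed compute infrastructure and autoscales resources as needed.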

Learn more about [Google Cloud Serverless for Apache Spark](/dataproc-serverless/docs/overview) and [supported Spark workloads](/dataproc-serverless/docs/overview#for_spark_workload_capabilities).

In Serverless for Apache Spark, a `Batch` resource represents a batch workload.
The Google Cloud Pipeline Components SDK includes the following operators to
create `Batch` resources and monitor their execution:

- [`DataprocPySparkBatchOp`](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.19.0/api/v1/dataproc.html#v1.dataproc.DataprocPySparkBatchOp)
- [`DataprocSparkBatchOp`](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.19.0/api/v1/dataproc.html#v1.dataproc.DataprocSparkBatchOp)
- [`DataprocSparkRBatchOp`](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.19.0/api/v1/dataproc.html#v1.dataproc.DataprocSparkRBatchOp)
- [`DataprocSparkSqlBatchOp`](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.19.0/api/v1/dataproc.html#v1.dataproc.DataprocSparkSqlBatchOp)

API reference

- For component reference, see the
  [Google Cloud Pipeline Components SDK reference for Google Cloud Serverless for Apache Spark components](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.19.0/api/v1/dataproc.html).

- For Serverless for Apache Spark resource reference, see the following API
  reference page:

  - [`Batch`](/dataproc-serverless/docs/reference/rest/v1/projects.locations.batches#resource:-batch) resource

Tutorials

- [Get started with Google Cloud Serverless for Apache Spark pipeline components](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/ml_ops/stage3/get_started_with_dataproc_serverless_pipeline_components.ipynb)

Version history and release notes

To learn more about the version history and changes to the Google Cloud Pipeline Components SDK, see the [Google Cloud Pipeline Components SDK Release Notes](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-2.19.0/release.html).

Technical support contacts

If you have any questions, reach out to
[kfp-dataproc-components@google.com](mailto:kfp-dataproc-components@google.com).

Last updated: 2025-06-23 (UTC).
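
Example sketch

The following is a minimal sketch, not an official sample, of how one of the operators named on this page, `DataprocPySparkBatchOp`, might be wired into a pipeline. It assumes the `kfp` and `google-cloud-pipeline-components` packages are installed. The project ID, region, Cloud Storage URIs, pipeline name, and the helper names `pyspark_batch_params`, `build_pipeline`, and `compile_pipeline` are hypothetical placeholders introduced for illustration. Check the component reference for the full parameter list before relying on any argument shown here.

```python
"""Sketch: submit a PySpark batch workload from a Vertex AI pipeline."""

# Hypothetical placeholder values; replace them with your own.
PROJECT_ID = "my-project"
LOCATION = "us-central1"
MAIN_PYTHON_FILE_URI = "gs://my-bucket/jobs/wordcount.py"
JOB_ARGS = ["gs://my-bucket/data/input.txt"]


def pyspark_batch_params(project, location, main_python_file_uri, args=()):
    """Collect keyword arguments for DataprocPySparkBatchOp.

    Only a small subset of the operator's parameters is covered here.
    """
    # The Batch API expects the PySpark driver to be a .py file.
    if not main_python_file_uri.endswith(".py"):
        raise ValueError("main_python_file_uri must point to a .py file")
    return {
        "project": project,
        "location": location,
        "main_python_file_uri": main_python_file_uri,
        "args": list(args),
    }


def build_pipeline():
    """Return a pipeline function with a single PySpark batch step."""
    # Imported lazily so the helper above works without the SDKs installed.
    from kfp import dsl
    from google_cloud_pipeline_components.v1.dataproc import (
        DataprocPySparkBatchOp,
    )

    @dsl.pipeline(name="pyspark-batch-sketch")
    def pipeline():
        # Creates a Batch resource and waits for the workload to finish.
        DataprocPySparkBatchOp(
            **pyspark_batch_params(
                PROJECT_ID, LOCATION, MAIN_PYTHON_FILE_URI, args=JOB_ARGS
            )
        )

    return pipeline


def compile_pipeline(package_path="pyspark_batch_pipeline.yaml"):
    """Compile the pipeline to a YAML spec for Vertex AI Pipelines."""
    from kfp import compiler

    compiler.Compiler().compile(
        pipeline_func=build_pipeline(), package_path=package_path
    )
```

Calling `compile_pipeline()` writes a pipeline spec that can then be submitted to Vertex AI Pipelines. The other three operators take different main-file arguments, so consult the component reference rather than assuming the same keyword names apply.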