The Dataproc Serverless components let you run Apache Spark batch workloads from a pipeline within Vertex AI Pipelines. Dataproc Serverless runs the batch workloads on a managed compute infrastructure, autoscaling resources as needed.
Learn more about Dataproc Serverless and supported Spark workloads.
In Dataproc Serverless, a Batch resource represents a batch workload.
The Google Cloud Pipeline Components SDK includes the following operators to create Batch resources and monitor their execution:
DataprocPySparkBatchOp
DataprocSparkBatchOp
DataprocSparkRBatchOp
DataprocSparkSqlBatchOp
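The following is a minimal sketch of a pipeline that uses DataprocPySparkBatchOp to run a PySpark batch workload; the project ID, region, Cloud Storage path, pipeline name, and output package path are placeholders, not values from this page.

```python
# Minimal sketch: a Vertex AI pipeline that creates a Dataproc Serverless
# Batch resource for a PySpark workload. Placeholder values are marked below.
from google_cloud_pipeline_components.v1.dataproc import DataprocPySparkBatchOp
from kfp import compiler, dsl

PROJECT_ID = "my-project"                            # placeholder project ID
LOCATION = "us-central1"                             # placeholder region
MAIN_PYTHON_FILE = "gs://my-bucket/spark/main.py"    # placeholder PySpark script


@dsl.pipeline(name="dataproc-serverless-pyspark-batch")
def pyspark_batch_pipeline(batch_id: str):
    # Creates the Batch resource and waits for the workload to complete.
    DataprocPySparkBatchOp(
        project=PROJECT_ID,
        location=LOCATION,
        batch_id=batch_id,
        main_python_file_uri=MAIN_PYTHON_FILE,
    )


# Compile the pipeline so it can be submitted to Vertex AI Pipelines.
compiler.Compiler().compile(
    pipeline_func=pyspark_batch_pipeline,
    package_path="pyspark_batch_pipeline.yaml",
)
```

The compiled pipeline definition can then be submitted as a pipeline run in Vertex AI Pipelines, passing a unique batch_id as a pipeline parameter.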
API reference
For component reference, see the Google Cloud Pipeline Components SDK reference for Dataproc Serverless components.
For Dataproc Serverless resource reference, see the following API reference page:
Batch resource
Tutorials
Version history and release notes
To learn more about the version history and changes to the Google Cloud Pipeline Components SDK, see the Google Cloud Pipeline Components SDK Release Notes.
Technical support contacts
If you have any questions, reach out to kfp-dataproc-components@google.com.