To set up and run Dataproc workloads and jobs, use the Dataproc templates available on GitHub.
Templates are provided for the following languages and execution environments:
- Airflow orchestration templates: Run Spark jobs from DAGs in Airflow.
- Java templates: Run Spark batch workloads or jobs on Google Cloud Serverless for Apache Spark or an existing Dataproc cluster.
- Python templates: Run PySpark batch workloads on Google Cloud Serverless for Apache Spark.
- Notebook templates: Run Spark jobs using Vertex AI notebooks.
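As a minimal illustration of the Python path above, a PySpark template can be submitted as a Serverless for Apache Spark batch workload with the gcloud CLI. The template name, bucket paths, and template-specific flags shown here are illustrative placeholders, not exact values; consult the dataproc-templates repository on GitHub for the invocation each template expects.

```shell
# Sketch: submit a Python (PySpark) template as a Serverless for Apache Spark
# batch workload. Replace the region, staging bucket, and template arguments
# with values from your project; the GCSTOBIGQUERY flags below are placeholders.
gcloud dataproc batches submit pyspark main.py \
    --region=us-central1 \
    --deps-bucket=gs://my-staging-bucket \
    -- --template=GCSTOBIGQUERY \
       --gcs.bigquery.input.location=gs://my-bucket/input/ \
       --gcs.bigquery.output.dataset=my_dataset \
       --gcs.bigquery.output.table=my_table
```

Everything after the bare `--` separator is passed through to the template itself rather than interpreted by gcloud.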