Native BigQuery support for Apache Spark alongside SQL.
Industry’s first autoscaling serverless Spark, integrated with the best of Google-native and open source tools. Develop and run Spark where you need it across all use cases, including ETL, data science, and exploration.
Benefits
Operational simplicity through serverless Spark
Write Spark applications and pipelines that autoscale without any manual infrastructure provisioning or tuning.
Flexibility of consumption
One size does not fit all. You can choose between serverless, Kubernetes clusters, and compute clusters for your Spark applications.
Key features
Google Cloud's serverless Spark accelerates data science by automating infrastructure. Focus on your code, not cluster management. Automatic scaling and seamless integration with BigQuery and Vertex AI streamline workflows, enabling faster iteration and model development. The latest libraries for serverless Spark enable more use cases with less configuration. The latest code samples for data scientists include building a pipeline for predicting customer churn using Apache Spark, XGBoost, and the Hugging Face Transformers library.
Spark for data science in one click: Data scientists can use Spark for development from Vertex AI Workbench seamlessly, with built-in security. Spark is integrated with Vertex AI's MLOps features, where users can execute Spark code through notebook executors that are integrated with Vertex AI Pipelines.
Unified SQL and Spark experience: Create stored procedures for Apache Spark, written in Python, directly from BigQuery. You can then run and schedule these stored procedures in BigQuery using a Google Standard SQL query, similar to running SQL stored procedures.
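As a rough illustration of the stored-procedure workflow described above, the sketch below assembles the SQL text for a PySpark stored procedure and the statement that calls it. It only builds strings and does not contact BigQuery. The dataset, procedure, connection, and table names are hypothetical placeholders, the runtime version is an assumed value, and the statement shape should be checked against the current BigQuery documentation before use.

```python
# Hypothetical PySpark body for the procedure. Table and dataset names are
# placeholders; the "bigquery" format is provided by the Spark BigQuery connector.
PYSPARK_BODY = '''
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("word-totals").getOrCreate()

# Read a BigQuery table into a Spark DataFrame.
words = (
    spark.read.format("bigquery")
    .option("table", "bigquery-public-data.samples.shakespeare")
    .load()
)

# Aggregate and write the result back to BigQuery.
totals = words.groupBy("word").sum("word_count")
totals.write.format("bigquery").option("writeMethod", "direct").save(
    "my_dataset.word_totals"
)
'''


def build_procedure_ddl(dataset, name, connection, body, runtime_version="1.1"):
    """Return the text of a CREATE PROCEDURE statement wrapping a PySpark body.

    runtime_version is an assumed placeholder, not a recommendation.
    """
    if '"""' in body:
        # The body is embedded in a triple-quoted raw string literal.
        raise ValueError('PySpark body must not contain triple double quotes')
    return (
        f"CREATE OR REPLACE PROCEDURE {dataset}.{name}()\n"
        f"WITH CONNECTION `{connection}`\n"
        f'OPTIONS(engine="SPARK", runtime_version="{runtime_version}")\n'
        'LANGUAGE PYTHON AS R"""\n'
        f"{body.strip()}\n"
        '"""'
    )


def build_call_sql(dataset, name):
    """Return the SQL statement that runs the stored procedure."""
    return f"CALL {dataset}.{name}()"


if __name__ == "__main__":
    print(build_procedure_ddl(
        "my_dataset", "word_totals_proc", "my-project.us.my-connection", PYSPARK_BODY
    ))
    print(build_call_sql("my_dataset", "word_totals_proc"))
```

The generated text can be pasted into the BigQuery query editor or submitted through whichever client you already use. Keeping the PySpark body as a separate string makes it easier to test the Spark logic on its own before wrapping it in a procedure.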
Developers can spend all their time on code and logic, and use their chosen interface to submit Spark jobs that are provisioned and scaled automatically. Read the documentation for serverless Spark.
Run autoscaling Spark on data across Google Cloud from a single interface that has one-click access to SparkSQL, notebooks, or PySpark. The interface also offers easy collaboration, with the ability to save, share, and search notebooks and scripts alongside data, and built-in governance across data lakes.
Spark is a trademark of The Apache Software Foundation.