
Our vectorized engine is an easier way to optimize Spark: a smarter engine that delivers over 4.3x faster Spark performance*, reducing compute costs.
*The queries are derived from the TPC-DS standard and TPC-H standard and as such are not comparable to published TPC-DS standard and TPC-H standard results, as these runs do not comply with all requirements of the TPC-DS standard and TPC-H standard specification.
Apache Spark is a trademark of The Apache Software Foundation.
Features
Experience a faster way to run Spark. Run your large-scale ETL, data science, and SQL workloads over 4.3x faster than open source Apache Spark. This dramatic reduction in job runtime lowers the total cost of ownership for your Spark workloads by reducing compute time.
Discover an easier way to improve performance. Spend fewer valuable engineering cycles on optimizing Spark.
Leverage a smarter architecture. Lightning Engine automatically caches hot data in memory and uses high-throughput, optimized connectors for Cloud Storage and BigQuery, significantly reducing I/O latency and increasing throughput for large-scale Spark data processing.
Lightning Engine leverages a native C++ vectorized execution engine to process data in batches, dramatically improving CPU efficiency over traditional row-by-row processing. This is a core component of its breakthrough Spark performance.
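To make the contrast between row-by-row and batch processing concrete, here is a minimal Python sketch. It is an illustration of the general idea only, not Lightning Engine code: the sample data, the function names, and the batch size are all hypothetical, and a real vectorized engine gets its CPU efficiency from native code operating on columnar batches rather than from Python loops.

```python
# Illustrative only: contrasts row-at-a-time evaluation with batch (columnar)
# evaluation of the same expression. This is not Lightning Engine code.
from array import array

# Hypothetical sample data: (price, quantity) rows.
rows = [(2.5, 4), (1.0, 10), (3.0, 3), (4.0, 1)]


def revenue_row_by_row(rows):
    """Evaluate price * quantity one row at a time."""
    out = []
    for price, qty in rows:  # each iteration handles a single row
        out.append(price * qty)
    return out


def to_columns(rows):
    """Pivot rows into one contiguous, typed array per column."""
    prices = array("d", (r[0] for r in rows))  # 64-bit floats
    qtys = array("q", (r[1] for r in rows))    # 64-bit signed integers
    return prices, qtys


def revenue_batched(prices, qtys, batch_size=2):
    """Evaluate the same expression over fixed-size column batches."""
    out = array("d")
    for start in range(0, len(prices), batch_size):
        p = prices[start:start + batch_size]
        q = qtys[start:start + batch_size]
        # One tight pass per batch over homogeneous, contiguous values.
        out.extend(a * b for a, b in zip(p, q))
    return out


prices, qtys = to_columns(rows)
row_result = revenue_row_by_row(rows)
batch_result = list(revenue_batched(prices, qtys))
print(row_result == batch_result)
```

Both functions compute the same values; what differs is the data layout and the unit of work. Operating on typed column batches is what lets a native engine keep data cache-friendly and apply the same operation to many values at once.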
Availability
Lightning Engine is for your most demanding Spark workloads. You can access it with the premium tiers of Dataproc and Google Cloud Serverless for Apache Spark.

| Product | Availability | Access |
|---|---|---|
| Google Cloud Serverless for Apache Spark - Premium tier | Generally available | |
| Dataproc on Google Compute Engine | In preview | Coming soon |
How It Works
Common Uses
Accelerate the feature engineering and data preparation steps that are critical for your machine learning lifecycle. By speeding up the most time-consuming part of the ML workflow, your data scientists can run more experiments, iterate on models faster, and get valuable AI applications into production sooner.
Pricing
Accelerated Spark, your way

Lightning Engine is a feature of the premium tiers of Dataproc and Google Cloud Serverless for Apache Spark.

| Product | Pricing |
|---|---|
| | In preview, coming soon. |