
Serving data from Iceberg lakehouses fast and fresh with Spanner columnar engine

February 26, 2026
Jagan R. Athreya

Group Product Manager

Girish Baliga

Director of Engineering

The divide between data in operational databases and analytical data lakehouses is disappearing fast. As businesses increasingly adopt zero-ETL lakehouse architectures, the challenge shifts from simply storing data in an open format such as Apache Iceberg to serving it with the low latency that modern applications and AI agents require. Whether it’s a cybersecurity provider like Palo Alto Networks that needs real-time threat detection insights, or a telecommunications giant like Vodafone reimagining data workflows for better customer experiences, organizations need to serve precomputed insights and AI models at scale.

That’s why today we are excited to announce the preview of the Spanner columnar engine, which lets you serve your Iceberg lakehouse data with the scale and low latency of Google Spanner.

Uniting OLTP and analytics: The Spanner columnar engine

Traditionally, organizations had to choose between the high-performance transactional capabilities of an OLTP database and the analytical power of a columnar warehouse. Spanner’s columnar engine ends this trade-off by uniting these two worlds in a single, horizontally scalable system.

The columnar engine uses a specialized storage mechanism designed to accelerate analytical queries, speeding up scans by up to 200× on live operational data. By storing data in a columnar format alongside traditional row-based storage, Spanner can automatically execute complex queries using vectorized execution, processing batches of data at once rather than row by row. Most importantly, this performance boost can be isolated from critical transaction workloads, so that customer-facing applications remain responsive while you gain real-time insights from your operational datastore.
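To make this concrete, a scan-heavy aggregation like the sketch below is the kind of query that benefits from the columnar representation: it touches only a few columns across many rows, so a columnar layout avoids reading the rest of each row and lets the engine process values in vectorized batches. The table and column names here are illustrative, not from a real schema:

```
-- Hypothetical schema: an `orders` table with millions of rows.
-- This full-table aggregation reads only three columns, which is
-- where a columnar layout and vectorized execution pay off.
SELECT
  region,
  SUM(amount) AS total_amount,
  COUNT(*)    AS order_count
FROM orders
WHERE order_date >= DATE '2026-01-01'
GROUP BY region;
```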

New features

Since we first announced Spanner columnar engine, we’ve added several new capabilities to accelerate performance and enhance usability. These include:

  • Vectorized execution: The engine supports faster columnar scans and aggregations using vectorized execution to process data more efficiently.

  • Automatic query handling: Spanner automatically redirects large-scan analytical queries to the columnar representation, speeding up analytical queries without affecting concurrent transactional workloads, allowing for true hybrid processing.

  • On-demand columnar data conversion: In addition to automated columnar data conversion, a new major compaction API helps accelerate the conversion of existing non-columnar data into the columnar format.

Why Iceberg data needs a fast, low-latency serving platform

Apache Iceberg has become the standard for open lakehouse architectures, providing a robust way to manage massive open format datasets in cloud-based storage. However, while lakehouses are excellent for large-scale analytics, they aren’t usually designed for the sub-second, high-concurrency "point lookups" or aggregated serving that live applications require.

This is where Spanner provides a unique value proposition. By moving curated, processed data from your lakehouse into Spanner — a process known as reverse ETL — you transform "cold" analytical data into "hot" operational data. Spanner provides the global consistency and high availability that applications require, making your Iceberg data accessible via low-latency APIs for real-time decision-making and agentic AI features.
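Once curated lakehouse data lands in Spanner, an application can serve it with a simple keyed lookup rather than an object-store scan. A minimal sketch, with hypothetical table and column names:

```
-- Hypothetical serving table populated by reverse ETL from the lakehouse.
-- A point lookup on the primary key returns in milliseconds, which
-- Iceberg tables on object storage are not designed to do.
SELECT recommendation_ids, updated_at
FROM user_recommendations
WHERE user_id = 'user-12345';
```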

Benchmarking Spanner columnar engine

To demonstrate Spanner’s new serving capabilities, we used ClickBench, a leading industry benchmark for analytical databases. ClickBench focuses on the types of queries common in web analytics and real-time dashboards, exactly the scenarios where low-latency serving is critical.

Our benchmark results with a single Spanner node showcase the power of the columnar engine:

ClickBench query    Spanner columnar engine speedup
Q01                 46.3×
Q02                 32.7×
Q19                 46.7×
Q32                 58.6×

These results represent the acceleration of real-world workloads on the Spanner columnar engine and show that Spanner can take complex, scan-heavy queries and return results in milliseconds, making it a great choice for powering real-time dashboards and user-facing features. Spanner is now a high-performance engine capable of delivering complex analytical results at the speeds modern digital experiences require.

Universal reverse ETL: Serving data from all lakehouses

Spanner is designed to be the serving layer for your entire data ecosystem. Whether your lakehouse lives in BigQuery, Snowflake, Databricks, or Oracle, Spanner offers an integrated pathway for high-speed serving.

Through our latest reverse ETL workflows, you can easily bridge the gap between your analytical and operational worlds:

  • BigQuery: The tight integration between Spanner and BigQuery provides a powerful, bidirectional bridge for managing Iceberg data across both operational and analytical environments. You can perform federated queries on BigLake Iceberg and Spanner tables using BigQuery external datasets for Spanner, allowing for real-time analysis without moving data. When you need to serve curated BigQuery insights at scale, reverse ETL workflows can push data from BigQuery and BigLake Iceberg tables directly into Spanner. Furthermore, you can capture live operational changes in Spanner and stream them into BigQuery and BigLake Iceberg tables using Datastream, ensuring your lakehouse remains synchronized with your transactional data from Spanner for agentic AI and real-time decision-making.

  • Databricks: Using Databricks' Universal Format (UniForm), you can generate Iceberg metadata for your Delta Lake tables automatically. This allows Spanner to ingest your processed Databricks data via BigQuery or Dataflow, so that your "curated" datasets are ready to power applications with minimal engineering overhead. 

  • Snowflake: You can export Iceberg tables to Google Cloud Storage and use BigQuery BigLake as a zero-copy intermediary to push that data directly into Spanner via EXPORT DATA commands. Alternatively, for simpler migrations, you can export Snowflake data as CSVs and use Dataflow templates for high-throughput ingestion into Spanner.

  • Oracle Autonomous AI Lakehouse: Oracle GoldenGate 26ai now allows you to replicate your Oracle Autonomous AI Lakehouse data into Spanner, so you can serve insights generated in Oracle’s data ecosystem with Spanner’s scale and consistency.
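As a sketch of the BigQuery path above, a reverse ETL export can be expressed with BigQuery’s EXPORT DATA statement. The project, instance, database, and table names below are placeholders, and the exact option keys are defined in the BigQuery reverse ETL documentation, so treat this as an illustration rather than a copy-paste recipe:

```
-- Push curated rows from a BigQuery (or BigLake Iceberg) table into a
-- Spanner table for low-latency serving. All names are placeholders.
EXPORT DATA OPTIONS (
  uri = 'https://spanner.googleapis.com/projects/my-project/instances/my-instance/databases/my-database',
  format = 'CLOUD_SPANNER',
  spanner_options = '''{ "table": "user_recommendations" }'''
) AS
SELECT
  user_id,
  recommendation_ids,
  CURRENT_TIMESTAMP() AS updated_at
FROM my_dataset.curated_recommendations;
```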

Get started today

It’s time to stop waiting for your lakehouse queries to finish and start serving your data hot, fresh, and fast with Spanner and its new columnar engine. The Spanner columnar engine is now in preview, and you can enable it on your existing Spanner tables today with a simple DDL change.
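Conceptually, enabling the feature is a table-level option change. The sketch below is hypothetical: the actual option name and syntax are defined in the Spanner columnar engine preview documentation, and `enable_columnar` is an assumed placeholder, not the documented option:

```
-- Hypothetical sketch only; consult the preview documentation for the
-- real DDL. Opts an existing table into the columnar representation.
ALTER TABLE orders
  SET OPTIONS (enable_columnar = true);
```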

To see the performance acceleration of the Spanner columnar engine for yourself, run the ClickBench queries for Spanner, which are available on GitHub.

To help you get started, here are codelabs to build reverse ETL pipelines to Spanner from:

  • Databricks:

  • Snowflake:
