Enterprise storage, governance, and performance to build scalable analytical, operational, and real-time AI use cases on a unified, cross-cloud, and multimodal open lakehouse.
Features
Apache Iceberg tables managed through the Lakehouse Iceberg REST catalog provide read and write interoperability between BigQuery, Google Cloud Managed Service for Apache Spark, and Iceberg-compatible OSS engines such as Spark, Trino, and Flink. Interoperability now extends to third-party engines like Snowflake and Databricks (Preview). You can connect your Iceberg tables directly to engines like BigQuery and Google managed Spark to accelerate your AI workloads.
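
As a rough illustration of what connecting an OSS Spark engine to an Iceberg REST catalog involves, the sketch below builds the standard Apache Iceberg Spark catalog properties. The property keys are the ones Apache Iceberg defines for Spark; the catalog name, endpoint URI, and warehouse path are placeholders (assumptions, not values from this page), and authentication settings are omitted.

```python
def iceberg_rest_catalog_conf(catalog_name: str, rest_uri: str, warehouse: str) -> dict[str, str]:
    """Build Spark properties that register an Iceberg REST catalog.

    All argument values are supplied by the caller; nothing here is
    specific to a particular vendor's endpoint.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Enables Iceberg SQL extensions such as MERGE INTO.
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers the catalog under `catalog_name` and selects the REST implementation.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": rest_uri,
        f"{prefix}.warehouse": warehouse,
    }


# Placeholder values (hypothetical): substitute your own endpoint and bucket.
conf = iceberg_rest_catalog_conf(
    catalog_name="lakehouse",
    rest_uri="https://REST_CATALOG_ENDPOINT",
    warehouse="gs://YOUR_BUCKET/warehouse",
)

for key, value in sorted(conf.items()):
    print(f"{key}={value}")
```

Each printed pair can be passed to Spark as a `--conf key=value` option or set on a session builder. Once the catalog is registered, tables are addressed as `lakehouse.<namespace>.<table>`.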
Leverage cross-cloud interconnect and caching (Preview) to get low-latency access to S3 Iceberg data. Run BigQuery and Spark jobs, as well as Gemini Enterprise jobs through the conversational analytics API, on AWS data with price-performance characteristics comparable to native data platform solutions. Plus, new Lakehouse runtime catalog federation (Preview) unites your ecosystem, letting BigQuery and Google managed Spark discover and analyze enterprise data across Snowflake, Databricks, and AWS Glue.
BigQuery’s enhanced vectorized execution is now the default for Lakehouse Iceberg REST catalog tables as well as Iceberg and Parquet tables in BigQuery catalog. Offload routine Iceberg maintenance like compaction, clustering, and garbage collection directly to Google Lakehouse. New automated features, including table management, partitioning, clustering, and history-based optimization (GA for Iceberg tables in BigQuery catalog; Preview for REST catalog), improve price-performance with zero manual overhead.
Power real-time insights with Iceberg using BigQuery streaming for high-throughput ingestion with zero read latency. Build complex processing pipelines with multi-statement transactions and BigQuery change data replication to Iceberg tables (GA for BigQuery catalog; Preview for REST catalog). Unlock multimodal, vector, and graph analytics by uniting structured and unstructured data using BigQuery ObjectRefs. Supercharge Spark data science workloads with Lightning Engine for up to 4.5x faster performance.
Power AI agents with real-time transactional data. Stream operational data from Spanner, AlloyDB, and Cloud SQL into BigQuery and managed Iceberg tables for instant analysis, then push the resulting analytical insights back into AlloyDB or Spanner to serve them with sub-millisecond latency at high QPS. Get unified governance with lineage, profiling, and data quality through the Knowledge Catalog (formerly Dataplex) integration. Map transactional, unstructured, and Iceberg data to your business logic, giving your agents the deep context they need to deliver accurate, reliable, and fully governed results.
Modernize to an open, unified lakehouse architecture
Modernize your data foundation with Google’s Lakehouse. Shift legacy Hadoop to serverless Cloud Storage and unify cross-cloud data by querying Iceberg and Delta Lake directly in BigQuery. Lakehouse Iceberg REST catalog eliminates silos, offering an interoperable runtime for Spark, Trino, and Flink. With Hive catalog support, you can easily modernize Hadoop workloads to Iceberg.
Seamless read/write sharing between BigQuery and OSS engines
Bring your existing Iceberg pipelines and seamlessly read or write to those tables using BigQuery or managed Spark, while easily modernizing with advanced BigQuery capabilities. Supercharge data science by running Spark ETL and BigQuery AI on the exact same Iceberg tables with zero data movement. Build conversational analytics agents in BigQuery that work with your data in S3.
Multimodal data analysis and accelerated AI workflows
Power multimodal analysis with BigQuery AI by combining structured Iceberg tables with unstructured data using BigQuery ObjectRefs for single-SQL inference. Train Gemini Enterprise Agent Platform models using time-travel to debug data drift. Federate global REST catalogs into a unified data mesh, analyze massive-scale logs affordably, and build models directly in integrated notebooks to accelerate your AI workflows.
Power data science workloads across developer environments
Unlock a frictionless Spark experience. Run SQL, Spark, and Python on a single copy of Iceberg data using unified IDEs. The new Antigravity VS Code extension acts as an AI partner to generate pipelines, debug code, and automate CI/CD from natural language. Plus, our vectorized Lightning Engine accelerates Spark execution up to 4.5x—requiring zero code changes.
Performance optimization with BigQuery
Leverage BigQuery’s scale while maintaining flexible storage. Execute multi-statement transactions in BigQuery to update multiple Iceberg tables as a single atomic unit, ensuring financial-grade consistency. Use BigQuery’s advanced runtime and partitioning support for Iceberg to create partitioned/clustered tables that leverage block pruning for high-speed, cost-effective query execution.
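
To make the atomic multi-table update concrete, here is a minimal sketch that wraps DML statements in a BigQuery multi-statement transaction using `BEGIN TRANSACTION` and `COMMIT TRANSACTION`. The helper name, the table names, and the `@amount` and `@from_id` query parameters are hypothetical and chosen only for illustration.

```python
def atomic_update_script(statements: list[str]) -> str:
    """Wrap DML statements in a single BigQuery multi-statement transaction.

    Either every statement commits or none does, which is what makes
    multi-table updates atomic.
    """
    if not statements:
        raise ValueError("at least one statement is required")
    # Normalize each statement to end with exactly one semicolon.
    body = "\n".join(f"  {s.strip().rstrip(';')};" for s in statements)
    return f"BEGIN TRANSACTION;\n{body}\nCOMMIT TRANSACTION;"


# Hypothetical Iceberg tables; values are passed as named query parameters.
script = atomic_update_script([
    "UPDATE `my_project.finance.accounts` "
    "SET balance = balance - @amount WHERE account_id = @from_id",
    "INSERT INTO `my_project.finance.ledger` (account_id, delta) "
    "VALUES (@from_id, -@amount)",
])

print(script)
```

The resulting script could be submitted as one query job, for example through the `google-cloud-bigquery` client with the parameters supplied in the job configuration. Keeping values in query parameters rather than formatting them into the SQL text avoids injection issues.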
Combined transactional and analytical for agentic AI
Fuel event-driven AI agents by unifying your transactional and analytical data. Automate continuous CDC replication from Spanner and AlloyDB directly into Lakehouse Iceberg tables. Next, use SQL continuous queries to monitor this streaming data, instantly run AI inference, and trigger downstream actions—delivering real-time intelligence for your most critical operational workloads.
Govern your lakehouse with Knowledge Catalog
Knowledge Catalog provides a unified governance layer by automatically discovering Iceberg tables in Cloud Storage and registering their metadata directly into the Lakehouse runtime catalog. This integration allows you to define centralized security policies ensuring consistent row- and column-level access control across both BigQuery and open-source processing engines.
Pricing
How Lakehouse (BigLake) pricing works: Lakehouse (BigLake) pricing is based on table management, metadata storage, and metadata access.

| Services and usage | Description | Price (USD) |
|---|---|---|
| Lakehouse (BigLake) table management | Lakehouse (BigLake) table management compute resources used for automatic table storage optimization. | Starting at $0.12 per DCU-hour |
| Lakehouse (BigLake) metadata storage | Lakehouse for Apache Iceberg metastore (Lakehouse runtime catalog) charges for metadata stored. Free tier includes 1 GiB of metadata storage per month. | Starting at $0.04 per GiB per month |
| Lakehouse (BigLake) metadata access | Class A operations: Lakehouse (BigLake) metadata access charges for writes, updates, list, create, and config operations, with a free tier of 5,000 operations per month included. | Starting at $6.00 per million operations |
| | Class B operations: Lakehouse (BigLake) metadata access charges for reads, get, and delete operations, with a free tier of 50,000 operations per month included. | Starting at $0.90 per million operations |
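
The arithmetic behind these rates can be sketched as a small estimator. This is an illustration only: it uses the "starting at" list prices and free tiers from the table, assumes the free tier is simply subtracted from monthly usage, and ignores any regional or contractual price differences. The `Usage` type and function name are hypothetical.

```python
from dataclasses import dataclass

# "Starting at" list prices and free tiers taken from the pricing table.
DCU_HOUR_USD = 0.12
METADATA_GIB_MONTH_USD = 0.04
CLASS_A_PER_MILLION_USD = 6.00
CLASS_B_PER_MILLION_USD = 0.90
FREE_METADATA_GIB = 1.0
FREE_CLASS_A_OPS = 5_000
FREE_CLASS_B_OPS = 50_000


@dataclass(frozen=True)
class Usage:
    """One month of usage (hypothetical input shape)."""
    dcu_hours: float
    metadata_gib: float
    class_a_ops: int
    class_b_ops: int


def estimate_monthly_cost(usage: Usage) -> dict[str, float]:
    """Estimate the monthly cost in USD, broken down by pricing component."""
    table_management = usage.dcu_hours * DCU_HOUR_USD
    # Assumption: only usage above the free tier is billed.
    storage = max(0.0, usage.metadata_gib - FREE_METADATA_GIB) * METADATA_GIB_MONTH_USD
    class_a = max(0, usage.class_a_ops - FREE_CLASS_A_OPS) / 1_000_000 * CLASS_A_PER_MILLION_USD
    class_b = max(0, usage.class_b_ops - FREE_CLASS_B_OPS) / 1_000_000 * CLASS_B_PER_MILLION_USD
    return {
        "table_management": table_management,
        "metadata_storage": storage,
        "class_a_access": class_a,
        "class_b_access": class_b,
        "total": table_management + storage + class_a + class_b,
    }


example = estimate_monthly_cost(
    Usage(dcu_hours=100, metadata_gib=11, class_a_ops=1_005_000, class_b_ops=2_050_000)
)
for component, usd in example.items():
    print(f"{component}: ${usd:.2f}")
```

For the example month, the billable quantities are 100 DCU-hours, 10 GiB of metadata, 1 million Class A operations, and 2 million Class B operations, which gives roughly $12.00 + $0.40 + $6.00 + $1.80 = $20.20.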