Analytics Lakehouse

Bring analytics to your data wherever it resides

The analytics lakehouse combines the key features of a data lake and a data warehouse into a single platform for storing, processing, and analyzing both structured and unstructured data.
New customers get $300 in free credits to fully explore and conduct an assessment of Google Cloud.
Who this is for
Data Engineers, Data Analysts, Data Scientists
What you'll learn
How to build an analytics lakehouse that can be used for analytic queries, machine learning, and visualization.
How you'll deploy
Once you've signed up for Google Cloud, you can deploy through the console.
Overview

What is the analytics lakehouse?

The analytics lakehouse combines the key features of a data lake and a data warehouse into a single platform for storing, processing, and analyzing both structured and unstructured data.

What does the analytics lakehouse architecture enable?

The analytics lakehouse enables organizations to extract data in real time, regardless of which cloud or datastore the data resides in, and to use it in aggregate for greater insight and artificial intelligence (AI), with governance and unified access across teams.

What are the benefits of the analytics lakehouse for data analysts, data engineers, and data scientists?

The analytics lakehouse provides a shared environment for data analysts (SQL), data engineers (Spark/Beam), and data scientists (ML), with a built-in, managed application stack and no need to move the data around.

Does the analytics lakehouse support integrated open-source tools?

Yes. The analytics lakehouse supports serverless solutions and integrated open-source tools, which give data professionals the flexibility to bring their tools of choice for analyzing and processing data, either in real time or in batch.

How does the analytics lakehouse support ML workloads?

The analytics lakehouse provides in-database ML with an inference engine, so ML models run where the data is. It also integrates with MLOps pipelines to productionize ML workloads without moving the data.
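
As a rough illustration of training and scoring a model where the data lives, the sketch below builds two SQL statements as strings. It assumes the in-database ML engine is BigQuery ML; the dataset, table, model, and column names (lakehouse.taxi_trips, lakehouse.trip_fare_model, fare) are hypothetical and do not come from this solution.

```python
def create_model_sql(model: str, source_table: str, label: str) -> str:
    """Build a BigQuery ML CREATE MODEL statement that trains in place."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'linear_reg', input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def predict_sql(model: str, input_table: str) -> str:
    """Build an ML.PREDICT query that scores rows without exporting them."""
    return (
        f"SELECT * FROM ML.PREDICT(MODEL `{model}`, "
        f"(SELECT * FROM `{input_table}`))"
    )


# Hypothetical names, used only for illustration.
train_statement = create_model_sql(
    "lakehouse.trip_fare_model", "lakehouse.taxi_trips", "fare"
)
score_statement = predict_sql("lakehouse.trip_fare_model", "lakehouse.new_trips")

print(train_statement)
print(score_statement)
```

Both statements would be submitted as ordinary queries, so the training data and the rows being scored stay in the lakehouse tables.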

What data governance features does the analytics lakehouse provide?

The analytics lakehouse provides centralized cataloging and fine-grained security for data management and governance. This lets organizations unify datastores and their associated metadata, which simplifies permissions and exploration.
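
One form of fine-grained security is a row access policy on a BigQuery table. The following is a minimal sketch that builds such a DDL statement as a string; the policy name, table, group address, and filter column are hypothetical placeholders, not values from this solution.

```python
from typing import Sequence


def row_access_policy_sql(
    policy: str, table: str, grantees: Sequence[str], filter_expr: str
) -> str:
    """Build a BigQuery CREATE ROW ACCESS POLICY statement."""
    # Each grantee is quoted, e.g. "group:analysts@example.com".
    grantee_list = ", ".join(f'"{g}"' for g in grantees)
    return (
        f"CREATE OR REPLACE ROW ACCESS POLICY {policy}\n"
        f"ON `{table}`\n"
        f"GRANT TO ({grantee_list})\n"
        f"FILTER USING ({filter_expr})"
    )


# Hypothetical example: US analysts see only rows where region = 'US'.
policy_statement = row_access_policy_sql(
    policy="us_only",
    table="lakehouse.orders",
    grantees=["group:us-analysts@example.com"],
    filter_expr="region = 'US'",
)
print(policy_statement)
```

Members of the listed group would then see only the rows that match the filter when they query the table.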
Solution Details

Solution Architecture
  1. Data lands in Google Cloud Storage buckets.
  2. A data lake is created in Dataplex. Data in the buckets is organized into entities, or tables, in the data lake.
  3. Tables in the data lake are immediately available in BigQuery as BigLake tables.
  4. Data transformations can be performed with Spark or BigQuery, using open file formats such as Apache Iceberg.
  5. Data can be secured using policy tags and row access policies.
  6. Machine learning can be applied to the tables.
  7. Dashboards are created from the data for further analysis.
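
In this solution, Dataplex makes the tables available in BigQuery automatically. For readers who want to see what a comparable, manually defined BigLake table might look like, the sketch below builds a BigQuery DDL statement for an external table over Cloud Storage files. The table name, connection ID, and bucket path are hypothetical, and the Parquet format is an assumption.

```python
from typing import Sequence


def biglake_table_sql(
    table: str, connection: str, uris: Sequence[str], file_format: str = "PARQUET"
) -> str:
    """Build a CREATE EXTERNAL TABLE statement that uses a connection."""
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        f"WITH CONNECTION `{connection}`\n"
        f"OPTIONS (format = '{file_format}', uris = [{uri_list}])"
    )


# Hypothetical names, used only for illustration.
table_statement = biglake_table_sql(
    table="lakehouse.orders",
    connection="my-project.us.lakehouse-connection",
    uris=["gs://example-lakehouse-bucket/orders/*.parquet"],
)
print(table_statement)
```

The connection is what lets BigQuery read the files on the user's behalf, which is why steps 5 and 6 can apply access policies and machine learning to the table without copying the data out of the bucket.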
Google Cloud Experience Level
Beginner
Estimated deployment time
12 min
2 min to configure, 10 min to deploy
Requirements
  • Active Google Cloud account
  • Administrator rights to your project