BigQuery overview

BigQuery is a fully managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence. BigQuery's serverless architecture lets you use SQL queries to answer your organization's biggest questions with zero infrastructure management. Federated queries let you read data from external sources while streaming supports continuous data updates. BigQuery's scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes.

BigQuery's architecture consists of two parts: a storage layer that ingests, stores, and optimizes data and a compute layer that provides analytics capabilities. These compute and storage layers efficiently operate independently of each other thanks to Google's petabit-scale network that enables the necessary communication between them.

Legacy databases usually have to share resources for read/write operations and analytical operations. This can result in resource conflicts and can slow queries while data is written to or read from storage. Shared resource pools can become further strained when resources are required for database management tasks such as assigning or revoking permissions. BigQuery's separation of compute and storage layers lets each layer dynamically allocate resources without impacting the performance or availability of the other.

Figure: BigQuery's architecture separates compute and storage resources, which communicate over a petabit-scale network.

This separation principle lets BigQuery innovate faster because storage and compute improvements can be deployed independently, without downtime or negative impact on system performance. It is also essential to offering a fully managed serverless data warehouse in which the BigQuery engineering team handles updates and maintenance. The result is that you don't need to provision or manually scale resources, leaving you free to focus on delivering value instead of traditional database management tasks.

BigQuery interfaces include the Google Cloud console and the BigQuery command-line tool. Developers and data scientists can use client libraries for familiar programming languages, including Python, Java, JavaScript, and Go, as well as BigQuery's REST API and RPC API, to transform and manage data. ODBC and JDBC drivers provide interaction with existing applications, including third-party tools and utilities.
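As a rough illustration of the REST interface, the sketch below builds (but does not send) a request for the API's `jobs.query` method using only the Python standard library. The helper name `build_query_request` and the project ID `my-project` are hypothetical. Actually sending the request would also require an OAuth 2.0 bearer token in the `Authorization` header, which is omitted here.

```python
import json
from urllib.parse import quote

# Base URL of the BigQuery REST API (v2).
BIGQUERY_API_ROOT = "https://bigquery.googleapis.com/bigquery/v2"


def build_query_request(project_id: str, sql: str, max_results: int = 100) -> tuple[str, bytes]:
    """Return the URL and JSON body for a jobs.query call, without sending it."""
    url = f"{BIGQUERY_API_ROOT}/projects/{quote(project_id, safe='')}/queries"
    body = {
        "query": sql,
        "useLegacySql": False,  # use GoogleSQL rather than legacy SQL
        "maxResults": max_results,
    }
    return url, json.dumps(body).encode("utf-8")


# "my-project" is a placeholder project ID.
url, body = build_query_request("my-project", "SELECT 1 AS x")
print(url)
print(body.decode("utf-8"))
```

In practice the client libraries wrap this request construction, authentication, and result paging, so hand-built requests are mostly useful for understanding what the libraries do.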

Whether you are a data analyst, data engineer, data warehouse administrator, or data scientist, BigQuery helps you load, process, and analyze data to inform critical business decisions.

Get started with BigQuery

You can start exploring BigQuery in minutes. Take advantage of BigQuery's free usage tier or no-cost sandbox to start loading and querying data.

  1. BigQuery's sandbox: Get started in the BigQuery sandbox, risk-free and at no cost.
  2. Google Cloud console quickstart: Familiarize yourself with the power of the BigQuery Console.
  3. Public datasets: Experience BigQuery's performance by exploring large, real-world data from the Public Datasets Program.
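One way to try a public dataset from code is sketched below, assuming the `google-cloud-bigquery` Python client library is installed and Application Default Credentials are configured. The function name `top_names` is hypothetical. The table `bigquery-public-data.usa_names.usa_1910_2013` is one of the public datasets; any other public table would work equally well.

```python
# Query text with a named parameter (@state) instead of string interpolation.
TOP_NAMES_SQL = """
SELECT name, SUM(number) AS total_people
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE state = @state
GROUP BY name
ORDER BY total_people DESC
LIMIT 10
"""


def top_names(state: str) -> list[tuple[str, int]]:
    """Run the query and return (name, total_people) pairs for one US state."""
    # Imported lazily so this module loads even where the library is not installed.
    from google.cloud import bigquery

    # Uses Application Default Credentials and the default project.
    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("state", "STRING", state)]
    )
    rows = client.query(TOP_NAMES_SQL, job_config=job_config).result()
    return [(row["name"], row["total_people"]) for row in rows]
```

Calling `top_names("TX")` would submit a query job and wait for its results. The same SQL can be pasted into the query editor in the Google Cloud console after replacing `@state` with a literal such as `'TX'`.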

Explore BigQuery

BigQuery's serverless infrastructure lets you focus on your data instead of resource management. BigQuery combines a cloud-based data warehouse and powerful analytic tools.

BigQuery storage

BigQuery stores data using a columnar storage format that is optimized for analytical queries. BigQuery presents data in tables, rows, and columns and provides full support for database transaction semantics (ACID). BigQuery storage is automatically replicated across multiple locations to provide high availability.
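To see why a columnar layout suits analytical queries, consider the toy sketch below. It is only an illustration in plain Python with made-up sample data, not a description of BigQuery's actual storage format: it pivots row-oriented records into one list per column, so an aggregate over a single column reads only that column's values.

```python
from typing import Any

# Hypothetical sample records in row-oriented form.
rows = [
    {"name": "Ada", "country": "UK", "amount": 120},
    {"name": "Lin", "country": "SG", "amount": 80},
    {"name": "Sam", "country": "UK", "amount": 50},
]


def to_columns(records: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into one list of values per column."""
    columns: dict[str, list[Any]] = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column touches only that column's values;
# the "name" and "country" columns are never read.
total_amount = sum(columns["amount"])
print(total_amount)  # 250
```

A query such as `SELECT SUM(amount)` benefits in the same way: the fewer columns a query references, the less data has to be scanned.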

For more information, see Overview of BigQuery storage.

BigQuery analytics

Uses for descriptive and prescriptive analysis include business intelligence, ad hoc analysis, geospatial analytics, and machine learning. You can query data stored in BigQuery, or use external tables or federated queries to query data where it lives, in sources including Cloud Storage, Bigtable, Spanner, or Google Sheets stored in Google Drive.
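As a minimal sketch of the external-table approach, the helper below generates a `CREATE EXTERNAL TABLE` statement for files in Cloud Storage. The helper name `external_table_ddl`, the table `mydataset.sales_ext`, and the bucket path are all hypothetical. The sketch assumes CSV files whose schema BigQuery can detect automatically, since no column list is given.

```python
def external_table_ddl(table: str, uris: list[str], file_format: str = "CSV") -> str:
    """Build a CREATE EXTERNAL TABLE statement for files stored outside BigQuery."""
    uri_list = ", ".join(f"'{uri}'" for uri in uris)
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        f"OPTIONS (\n"
        f"  format = '{file_format}',\n"
        f"  uris = [{uri_list}]\n"
        f")"
    )


# Placeholder dataset, table, and bucket names.
ddl = external_table_ddl("mydataset.sales_ext", ["gs://example-bucket/sales/*.csv"])
print(ddl)
```

Once such a table exists, it can be queried with ordinary SQL (for example `SELECT COUNT(*) FROM mydataset.sales_ext`) while the data stays in Cloud Storage.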

For more information, see Overview of BigQuery analytics.

BigQuery administration

BigQuery provides centralized management of data and compute resources, while Identity and Access Management (IAM) helps you secure those resources with the access model that's used throughout Google Cloud. Google Cloud security best practices provide a solid yet flexible approach that can include traditional perimeter security or a more complex and granular defense-in-depth approach.

  • Intro to data security and governance helps you understand data governance and what controls you might need to secure BigQuery resources.
  • Jobs are actions that BigQuery runs on your behalf to load, export, query, or copy data.
  • Reservations let you switch between on-demand pricing and capacity-based pricing.
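To make the IAM access model mentioned above more concrete, the sketch below manipulates an IAM policy represented as a plain dictionary of role bindings. The helper `grant_role` and the example email addresses are hypothetical, and the sketch ignores conditional bindings for simplicity. Reading and writing a real policy would go through the IAM or BigQuery APIs, which are not shown.

```python
import copy


def grant_role(policy: dict, role: str, member: str) -> dict:
    """Return a copy of an IAM policy with `member` added to the binding for `role`."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    # No binding for this role yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Placeholder principals; the roles are predefined BigQuery IAM roles.
policy = {
    "bindings": [
        {"role": "roles/bigquery.dataViewer", "members": ["user:a@example.com"]}
    ]
}
policy_v2 = grant_role(policy, "roles/bigquery.dataViewer", "user:b@example.com")
policy_v3 = grant_role(policy_v2, "roles/bigquery.jobUser", "user:b@example.com")
print(policy_v3)
```

In this example, `roles/bigquery.dataViewer` lets a principal read data, and `roles/bigquery.jobUser` lets a principal run jobs such as queries, so a user who needs to query a dataset typically needs both kinds of permission.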

For more information, see Introduction to BigQuery administration.

BigQuery resources

Explore BigQuery resources:

APIs, tools, and references

Reference materials for BigQuery developers and analysts:

BigQuery roles and resources

BigQuery addresses the needs of data professionals across the following roles and responsibilities.

Data Analyst

Task guidance to help if you need to do the following:

To take a tour of BigQuery's data analytics features directly in the Google Cloud console, click Take the tour.

Data Administrator

Task guidance to help if you need to do the following:

For more information, see Introduction to BigQuery administration.

To take a tour of BigQuery data administration features directly in the Google Cloud console, click Take the tour.

Data Scientist

Task guidance to help if you need to use BigQuery ML's machine learning to do the following:

Data Developer

Task guidance to help if you need to do the following:

BigQuery video tutorials

The following series of video tutorials gets you started with BigQuery:

  • How to get started with BigQuery (17:18): An overview of what BigQuery is and how to use it. Segments include ETL pipelines, pricing and optimization, BigQuery ML and BI Engine, and a closing demo of BigQuery in the Google Cloud console.
  • What is BigQuery? (4:39): An overview of how BigQuery is designed to ingest and store large amounts of data to help analysts and developers alike.
  • Using the BigQuery sandbox (3:05): How to set up a BigQuery sandbox, letting you run queries without needing a credit card.
  • Asking questions, running queries (5:11): How to write and run SQL queries in the BigQuery UI, plus picking a winning jersey number.
  • Loading data into BigQuery (5:31): How to ingest and analyze data in real time or as a one-time batch, plus cats v. dogs.
  • Visualizing query results (5:38): How data visualization makes complex datasets easier to understand and internalize.
  • Managing access with IAM (5:23): How to allow other users to query your datasets in BigQuery with IAM permissions and access control.
  • Saving and sharing queries (6:17): How to save and share your queries in BigQuery hassle-free.
  • Protecting sensitive data with authorized views (7:12): How to easily share datasets with different users by setting customized access controls.
  • Querying external data with BigQuery (5:49): How to set up an external data source in BigQuery and query data from Cloud Storage, Cloud SQL, Google Drive, and more.
  • What are user-defined functions? (4:59): How to create user-defined functions (UDFs) for analyzing datasets in BigQuery.

What's next