Developer tools and features

This document provides an overview of some of the features and tools that help you develop solutions with BigQuery.

Figure: BigQuery architecture diagram.

Developer features

This section describes some common built-in features for developers using BigQuery.

Load and transform data

BigQuery offers ways to batch load, stream, and generate new data. To choose the best option for your use case, see Introduction to loading data.

Queries

BigQuery is optimized to run analytic queries that are written in GoogleSQL on large datasets. You can schedule, save, and share queries that run on data stored in BigQuery, external data, data stored in other clouds, or public datasets.

Remote functions

You can use remote functions to deploy your functions in Cloud Functions or Cloud Run, and then invoke them directly from your GoogleSQL queries. This approach is especially useful if you need to implement your functions in languages other than SQL or JavaScript. Typical uses include translating text in a table from one language to another, initiating actions such as notifying you when values in a table drop below a threshold, and running batch transformations such as applying a machine learning (ML) model.
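The threshold notification case can be sketched as the body of a remote function endpoint. This is a minimal sketch, assuming the request carries a `calls` array with one argument list per row and the response carries a `replies` array of the same length. The handler name `handle_remote_call`, the threshold value, and the `LOW`/`OK` labels are hypothetical, and the HTTP wiring for Cloud Functions or Cloud Run is left out.

```python
import json


def handle_remote_call(request_body: str, threshold: float = 10.0) -> str:
    """Hypothetical handler body for a BigQuery remote function.

    Assumes the request JSON has a "calls" array (one argument list per row)
    and that the response must carry a "replies" array of equal length.
    """
    try:
        calls = json.loads(request_body)["calls"]
        # Each call is a list of arguments; this sketch expects one number.
        replies = ["LOW" if call[0] < threshold else "OK" for call in calls]
        return json.dumps({"replies": replies})
    except (KeyError, IndexError, TypeError, ValueError) as exc:
        # The HTTP layer (not shown) would pair this with a non-200 status.
        return json.dumps({"errorMessage": str(exc)})


if __name__ == "__main__":
    body = json.dumps({"calls": [[3.5], [42.0]]})
    print(handle_remote_call(body))
```

Because the handler works on a whole batch of rows at a time, one HTTP request can serve many rows of the calling query.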

Machine learning

You can use BigQuery ML to create and execute ML models using GoogleSQL queries.
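As an illustration, the following sketch builds the two GoogleSQL statements that a typical BigQuery ML workflow uses: one to train a model and one to run predictions. The helper names and the dataset, table, and column names are hypothetical, and the helpers only assemble query text from trusted identifiers. Running the statements is left to the console, the bq tool, or a client library.

```python
def create_model_sql(model: str, source_table: str, label: str) -> str:
    """Build a CREATE MODEL statement for a linear regression model."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'LINEAR_REG', input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def predict_sql(model: str, input_table: str) -> str:
    """Build an ML.PREDICT query that scores every row of input_table."""
    return (
        f"SELECT * FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT * FROM `{input_table}`))"
    )


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    print(create_model_sql("mydataset.price_model", "mydataset.training", "price"))
    print(predict_sql("mydataset.price_model", "mydataset.new_rows"))
```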

SQL stored procedures

You can use SQL stored procedures to call collections of statements from other queries or stored procedures. You can call built-in stored procedures or write your own. Procedures that you write can define variables and implement control flow.
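The following sketch shows what a procedure with a variable and control flow might look like, together with a script that calls it. The procedure name `classify_total`, its arguments, and the cutoff value are hypothetical. The Python helpers only assemble the GoogleSQL text.

```python
def create_procedure_sql(dataset: str) -> str:
    """Build a CREATE PROCEDURE statement that uses DECLARE and IF."""
    return (
        f"CREATE OR REPLACE PROCEDURE `{dataset}.classify_total`"
        "(IN total INT64, OUT label STRING)\n"
        "BEGIN\n"
        "  DECLARE cutoff INT64 DEFAULT 100;\n"
        "  IF total >= cutoff THEN\n"
        "    SET label = 'high';\n"
        "  ELSE\n"
        "    SET label = 'low';\n"
        "  END IF;\n"
        "END;"
    )


def call_procedure_sql(dataset: str, total: int) -> str:
    """Build a script that calls the procedure and reads its OUT argument."""
    return (
        "DECLARE label STRING;\n"
        f"CALL `{dataset}.classify_total`({int(total)}, label);\n"
        "SELECT label;"
    )


if __name__ == "__main__":
    print(create_procedure_sql("mydataset"))
    print(call_procedure_sql("mydataset", 150))
```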

Semi-structured data

You can use JSON data in GoogleSQL to ingest semi-structured data into BigQuery without providing a schema up front. You can use the field access operator to query the values of fields and array elements directly.
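For example, a query over a table with a JSON column might read nested fields and array elements as in the sketch below. The column name `payload` and the fields `customer.id` and `items[0].sku` are hypothetical. The helper only builds the query text.

```python
def json_field_query(table: str) -> str:
    """Build a query that reads nested JSON fields and an array element."""
    return (
        "SELECT\n"
        "  payload.customer.id AS customer_id,\n"  # field access operator
        "  payload.items[0].sku AS first_sku\n"    # array subscript, then field
        f"FROM `{table}`"
    )


if __name__ == "__main__":
    # "mydataset.events" is a placeholder table with a JSON column "payload".
    print(json_field_query("mydataset.events"))
```

The selected values keep the JSON type, so a query that needs a SQL scalar would convert them explicitly.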

Time travel

You can use time travel to access data stored in BigQuery as it existed at any point within the time travel window, which is seven days by default. This feature lets you restore updated, deleted, or expired tables even if you haven't backed up your data.
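A time travel query uses the `FOR SYSTEM_TIME AS OF` clause. The sketch below builds such a query for a point a given number of hours in the past. It assumes a seven-day window, and the helper name and table name are hypothetical.

```python
def time_travel_sql(table: str, hours_ago: int) -> str:
    """Build a query that reads a table as it was `hours_ago` hours ago."""
    # Assumption: a seven-day (168-hour) time travel window.
    if not 0 <= hours_ago <= 7 * 24:
        raise ValueError("hours_ago must fall within the seven-day window")
    return (
        f"SELECT * FROM `{table}`\n"
        "FOR SYSTEM_TIME AS OF "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {hours_ago} HOUR)"
    )


if __name__ == "__main__":
    print(time_travel_sql("mydataset.orders", 1))
```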

Table snapshots

You can use table snapshots when you need to back up your table from a point in time beyond the time travel window. BigQuery only stores bytes that are different between a snapshot and its base table, so a table snapshot typically uses less storage than a full copy of the table.
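As a sketch, a snapshot can be created with a `CREATE SNAPSHOT TABLE ... CLONE` statement. The helper below builds one with an expiration time. The helper name, table names, and retention period are hypothetical.

```python
def create_snapshot_sql(snapshot: str, base_table: str, keep_days: int) -> str:
    """Build a CREATE SNAPSHOT TABLE statement that expires after keep_days."""
    if keep_days <= 0:
        raise ValueError("keep_days must be positive")
    return (
        f"CREATE SNAPSHOT TABLE `{snapshot}`\n"
        f"CLONE `{base_table}`\n"
        "OPTIONS (expiration_timestamp = "
        f"TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL {keep_days} DAY))"
    )


if __name__ == "__main__":
    print(create_snapshot_sql("backups.orders_snapshot", "mydataset.orders", 30))
```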

Table clones

You can use table clones to version tables and test table schema changes. A table clone is a lightweight, writable copy of another table. You are only charged for storage of data in the table clone that differs from the base table, so initially there is no storage cost for a table clone.
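The schema-testing workflow can be sketched as two statements: clone the table, then alter the clone while the base table stays unchanged. The helper names, table names, and the added column are hypothetical.

```python
def create_clone_sql(clone: str, base_table: str) -> str:
    """Build a CREATE TABLE ... CLONE statement."""
    return f"CREATE TABLE `{clone}`\nCLONE `{base_table}`"


def add_column_sql(table: str, column: str, sql_type: str) -> str:
    """Build an ALTER TABLE statement to try a schema change on the clone."""
    return f"ALTER TABLE `{table}`\nADD COLUMN {column} {sql_type}"


if __name__ == "__main__":
    print(create_clone_sql("mydataset.orders_test", "mydataset.orders"))
    print(add_column_sql("mydataset.orders_test", "note", "STRING"))
```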

External tables

You can use external tables to query data that resides outside BigQuery, such as data in a different Google Cloud database, files in Cloud Storage, or data in a different cloud product altogether. This feature lets you perform ELT workloads with a single query or join BigQuery tables with frequently changing data from another source.
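For files in Cloud Storage, an external table can be defined with a `CREATE EXTERNAL TABLE` statement. The sketch below builds that statement. The helper name, table name, and bucket path are hypothetical, and the schema is assumed to be detected automatically.

```python
from typing import Sequence


def create_external_table_sql(
    table: str, uris: Sequence[str], file_format: str = "CSV"
) -> str:
    """Build a CREATE EXTERNAL TABLE statement over Cloud Storage files."""
    if not uris:
        raise ValueError("at least one URI is required")
    uri_list = ", ".join(f"'{uri}'" for uri in uris)
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{table}`\n"
        f"OPTIONS (format = '{file_format}', uris = [{uri_list}])"
    )


if __name__ == "__main__":
    # The bucket and path are placeholders.
    print(create_external_table_sql(
        "mydataset.daily_prices", ["gs://example-bucket/prices/*.csv"]
    ))
```

After the table is defined, it can be joined with native BigQuery tables like any other table.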

User-defined functions

You can write user-defined functions (UDFs) in GoogleSQL or JavaScript that can be reused across queries. You can authorize a UDF as a routine, which lets you share query results with specific users or groups without giving those users or groups access to the underlying tables.
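The sketch below holds a query that defines one temporary UDF in GoogleSQL and one in JavaScript, then calls both. The function names `add_tax` and `shout` and their logic are hypothetical. The Python constant only stores the query text.

```python
# Query text with a SQL UDF and a JavaScript UDF (both temporary).
UDF_QUERY = '''\
CREATE TEMP FUNCTION add_tax(price FLOAT64, rate FLOAT64)
RETURNS FLOAT64
AS (price * (1 + rate));

CREATE TEMP FUNCTION shout(word STRING)
RETURNS STRING
LANGUAGE js
AS r"""
  return word.toUpperCase() + "!";
""";

SELECT add_tax(100.0, 0.2) AS gross, shout('hello') AS loud;
'''

if __name__ == "__main__":
    print(UDF_QUERY)
```

Temporary functions exist only for the duration of the query. A function that should be reused across queries would be created as a persistent function in a dataset instead.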

BigQuery APIs

BigQuery offers REST and gRPC APIs to programmatically interface with its different types of services. You can authenticate your client's identity to access the APIs by using service accounts or user accounts.

For more information about the available APIs and what each one offers, see BigQuery APIs and libraries overview.
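To show the shape of a REST call, the following sketch assembles a `jobs.query` request without sending it. The helper name `build_query_request` and the project ID are hypothetical, and the access token is a placeholder for one obtained through a service account or user account.

```python
import json


def build_query_request(project_id: str, sql: str, access_token: str) -> dict:
    """Assemble (but do not send) a BigQuery jobs.query REST request."""
    return {
        "method": "POST",
        "url": (
            "https://bigquery.googleapis.com/bigquery/v2/"
            f"projects/{project_id}/queries"
        ),
        "headers": {
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        # useLegacySql=False selects GoogleSQL for the query text.
        "body": json.dumps({"query": sql, "useLegacySql": False}),
    }


if __name__ == "__main__":
    request = build_query_request("my-project", "SELECT 1 AS n", "TOKEN_PLACEHOLDER")
    print(request["method"], request["url"])
    print(request["body"])
```

In practice a client library handles this request construction, authentication, and retries for you.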

BigQuery DataFrames library

BigQuery DataFrames is a Python API that you can use to analyze data and perform machine learning tasks in BigQuery. The source code for the API is available on GitHub.

Get started with BigQuery DataFrames by using the BigQuery DataFrames quickstart.

Client libraries

Client libraries let you access BigQuery APIs directly by using your preferred programming language, including C#, Go, Java, Node.js, PHP, Python, and Ruby. The following resources are available for each of these languages:

  • Quickstart. Follow step-by-step instructions to run a query in BigQuery by using the client library.
  • API reference documentation. View descriptions of the methods and objects that are supported for your language.
  • GitHub source code. View the source code for the BigQuery client library on GitHub.
  • Stack Overflow. Read, ask, and answer questions that are related to the BigQuery client library.

For more information about how to use the BigQuery client libraries in your local environment, see BigQuery API client libraries.
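As a minimal sketch of the Python client library, the helper below runs a query and returns the rows as dictionaries. It assumes the `google-cloud-bigquery` package is installed and that Application Default Credentials are configured. The helper name `run_query` and its optional `client` parameter are hypothetical, added so that the function can be exercised with a stand-in client.

```python
def run_query(sql, client=None):
    """Run a query and return its rows as a list of dictionaries."""
    if client is None:
        # Imported lazily so the sketch loads even without the package.
        from google.cloud import bigquery

        client = bigquery.Client()  # uses Application Default Credentials
    # query() starts a query job; result() waits for it and yields rows.
    rows = client.query(sql).result()
    return [dict(row.items()) for row in rows]
```

A call such as `run_query("SELECT 1 AS n")` would then return a list with one dictionary per result row.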

Code samples

You can browse BigQuery code samples that provide complete snippets for accomplishing common tasks in BigQuery, such as creating tables, listing connections, viewing capacity commitments and reservations, and loading data. To view a sample, select an API, task, and your preferred language.

Programmatic tools and services

The following services are integrated with BigQuery, offering additional capabilities for building solutions:

  • Dataproc. A fully managed service for running Apache Hadoop and Apache Spark jobs. Dataproc provides the BigQuery connector, which lets Hadoop and Spark directly process data from BigQuery.
  • Dataflow. A fully managed service for running Apache Beam jobs at scale. The BigQuery I/O connector for Beam lets Beam pipelines read data from and write data to BigQuery.
  • Cloud Composer. A fully managed workflow orchestration service built on Apache Airflow. BigQuery operators let Airflow workflows manage datasets and tables, run queries, and validate data.
  • Pub/Sub. An asynchronous and scalable messaging service. Pub/Sub provides BigQuery subscriptions for writing messages to an existing BigQuery table as they are received.

Continuous integration and deployment

The following options help you manage and automate your developer workflow with BigQuery:

  • BigQuery Terraform module. A module to automate the instantiation and deployment of your BigQuery datasets and tables.
  • bq command-line tool. A Python-based command-line tool for BigQuery.
  • Dataform. A service for data analysts to develop, test, version control, and schedule complex SQL workflows for data transformation in BigQuery.
  • dbt. A framework to help you orchestrate and deploy workflows, test and catalog your data, and reuse pieces of code as macros.
  • Liquibase. A database schema change management solution that lets you revise and release changes quickly and safely from development to production.

ODBC and JDBC drivers

Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC) drivers let you write database-neutral software applications in popular programming languages to connect BigQuery to your existing infrastructure. For more information and the latest driver releases, see ODBC and JDBC drivers for BigQuery.

What's next

  • For more information about upcoming events and resources for Google Cloud developers, see the developer center.
  • For more information about how other companies use Google Cloud, see data cloud for ISVs.
  • For more information about validated partner solutions that integrate with BigQuery, see Google Cloud Ready - BigQuery.