Introduction to connections
BigQuery lets you query data that's stored outside of BigQuery in Google Cloud services like Cloud Storage or Spanner, or in third-party sources like AWS or Azure. These external connections use the BigQuery Connection API.
For example, suppose that you store details about customer orders in Cloud SQL and data about sales in BigQuery, and you want to join the two tables in a single query. You can create a Cloud SQL connection to the external database by using the BigQuery Connection API. With connections, you never send database credentials as cleartext.
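The join described above can be sketched as a query string built around the `EXTERNAL_QUERY` function. This is a minimal sketch, not a definitive implementation: the connection ID `my-project.us.orders-connection`, the dataset `mydataset`, and all table and column names are hypothetical placeholders, and the helper `federated_join_sql` is introduced here only for illustration.

```python
def federated_join_sql(connection_id: str, external_query: str) -> str:
    """Build a query that joins a BigQuery table with rows from Cloud SQL.

    connection_id: ID of an existing Cloud SQL connection (hypothetical here).
    external_query: SQL that runs in the external database.
    """
    # The external query is wrapped in triple single quotes, so it must not
    # contain that delimiter itself.
    if "'''" in external_query:
        raise ValueError("external_query must not contain triple single quotes")
    return (
        "SELECT s.order_id, s.amount, o.customer_name\n"
        "FROM mydataset.sales AS s\n"          # hypothetical BigQuery table
        "JOIN EXTERNAL_QUERY(\n"
        f"  '{connection_id}',\n"
        f"  '''{external_query}'''\n"
        ") AS o\n"
        "ON s.order_id = o.order_id"
    )


sql = federated_join_sql(
    "my-project.us.orders-connection",             # hypothetical connection ID
    "SELECT order_id, customer_name FROM orders",  # hypothetical Cloud SQL table
)
print(sql)
```

The query text refers only to the connection ID; the database credentials stay with the connection and do not appear in the query.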
A connection is encrypted and securely stored in the BigQuery connection service. You can give users access to connections by granting them BigQuery connection Identity and Access Management (IAM) roles.
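As an illustration of granting access, the sketch below adds a member to a role binding in an IAM policy represented as a plain dictionary. It assumes the predefined role `roles/bigquery.connectionUser` and the Connection API resource name format `projects/PROJECT/locations/LOCATION/connections/CONNECTION`; the project, connection, and user names are hypothetical, and the helpers are not part of any library. Writing the policy back to the connection service is not shown.

```python
def connection_resource_name(project: str, location: str, connection: str) -> str:
    """Return the full resource name of a connection."""
    return f"projects/{project}/locations/{location}/connections/{connection}"


def grant_role(policy: dict, role: str, member: str) -> dict:
    """Return a copy of an IAM policy with `member` added to the `role` binding."""
    # Copy each binding so the caller's policy is left unchanged.
    bindings = [
        {**binding, "members": list(binding["members"])}
        for binding in policy.get("bindings", [])
    ]
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            break
    else:
        # No binding for this role yet, so create one.
        bindings.append({"role": role, "members": [member]})
    return {**policy, "bindings": bindings}


resource = connection_resource_name("my-project", "us", "orders-connection")
policy = grant_role({}, "roles/bigquery.connectionUser", "user:analyst@example.com")
print(resource)
print(policy)
```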
Connection types
BigQuery provides different connection types for the following external data sources:
- Amazon Simple Storage Service (Amazon S3)
- Apache Spark
- Azure Blob Storage
- Google Cloud resources such as Vertex AI remote models, remote functions, and BigLake
- Spanner
- Cloud SQL
- AlloyDB for PostgreSQL
- SAP Datasphere
Amazon S3 connections
To create an Amazon S3 connection with BigQuery Omni, see Connect to Amazon S3.
Once you have an existing Amazon S3 connection, you can do the following:
- Create external tables on Amazon S3
- Query the Amazon S3 data
- Export results to Amazon S3
- Create datasets based on AWS Glue databases
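To illustrate the first item in the list above, the following sketch builds a `CREATE EXTERNAL TABLE ... WITH CONNECTION` statement for data in Amazon S3. The table name, connection ID, bucket path, and the helper `s3_external_table_ddl` are all hypothetical; the sketch assumes the connection already exists in a BigQuery Omni AWS region.

```python
def s3_external_table_ddl(
    table: str, connection: str, data_format: str, uris: list[str]
) -> str:
    """Build DDL for an external table backed by Amazon S3 objects."""
    if not uris:
        raise ValueError("at least one S3 URI is required")
    for uri in uris:
        if not uri.startswith("s3://"):
            raise ValueError(f"not an S3 URI: {uri}")
    uri_list = ", ".join(f"'{uri}'" for uri in uris)
    return (
        f"CREATE EXTERNAL TABLE {table}\n"
        f"WITH CONNECTION `{connection}`\n"
        f"OPTIONS (format = '{data_format}', uris = [{uri_list}])"
    )


ddl = s3_external_table_ddl(
    "mydataset.s3_orders",               # hypothetical dataset and table
    "aws-us-east-1.my-s3-connection",    # hypothetical connection ID
    "PARQUET",
    ["s3://my-bucket/orders/*"],         # hypothetical bucket path
)
print(ddl)
```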
Spark connections
Stored procedures for Apache Spark let you run procedures written in Python from BigQuery. A Spark connection lets BigQuery connect to Dataproc Serverless to run these stored procedures.
To create this connection, see Create connections.
Blob Storage connections
To create a Blob Storage connection with BigQuery Omni, see Connect to Blob Storage.
Once you have an existing Blob Storage connection, you can do the following:
- Create external tables based on Blob Storage
- Query the Blob Storage data
- Export results to Blob Storage
Google Cloud resource connections
A Google Cloud resource connection authorizes access to other Google Cloud resources, such as Vertex AI remote models, remote functions, and BigLake. For details on how to set up a Google Cloud resource connection, see Create and set up a Cloud resource connection.
Once you have an existing Google Cloud resource connection, you can create the following BigQuery objects with it:
- Remote models. For more information, see The CREATE MODEL statement for remote models over LLMs, The CREATE MODEL statement for remote models over Cloud AI services, and The CREATE MODEL statement for remote models over Vertex AI hosted models.
- Remote functions. BigQuery remote functions let you implement functions in any supported language in Cloud Run functions or Cloud Run. A remote function connection lets BigQuery connect to Cloud Run functions or Cloud Run and run these functions. To create a BigQuery remote function connection, see Create a connection.
- BigLake tables. BigLake connections connect BigLake tables to external data sources while retaining fine-grained BigQuery access control and security for both structured and unstructured data in Cloud Storage.
- Object tables. For more information, see Introduction to object tables.
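As an example of one object from the list above, the following sketch builds a `CREATE FUNCTION ... REMOTE WITH CONNECTION` statement for a remote function. The function name, its single `FLOAT64` argument, the connection ID, and the endpoint URL are hypothetical, as is the helper `remote_function_ddl`; the sketch assumes an endpoint already deployed on Cloud Run functions or Cloud Run.

```python
def remote_function_ddl(function: str, connection: str, endpoint: str) -> str:
    """Build DDL for a remote function that takes and returns one FLOAT64."""
    if not endpoint.startswith("https://"):
        raise ValueError("endpoint must be an HTTPS URL")
    return (
        f"CREATE FUNCTION {function}(x FLOAT64) RETURNS FLOAT64\n"
        f"REMOTE WITH CONNECTION `{connection}`\n"
        f"OPTIONS (endpoint = '{endpoint}')"
    )


fn_ddl = remote_function_ddl(
    "mydataset.convert_currency",                        # hypothetical function
    "my-project.us.my-resource-connection",              # hypothetical connection ID
    "https://us-central1-my-project.cloudfunctions.net/convert",  # hypothetical URL
)
print(fn_ddl)
```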
Spanner connections
To create a Spanner connection, see Connect to Spanner.
Once you have an existing Spanner connection, you can run federated queries.
Cloud SQL connections
To create a Cloud SQL connection, see Connect to Cloud SQL.
Once you have an existing Cloud SQL connection, you can run federated queries.
AlloyDB connections
To create an AlloyDB connection, see Connect to AlloyDB for PostgreSQL.
Once you have an existing AlloyDB connection, you can run federated queries.
SAP Datasphere connections
To create an SAP Datasphere connection, see Connect to SAP Datasphere.
Once you have an existing SAP Datasphere connection, you can run federated queries.
Audit logs
BigQuery logs usage and management requests for connections. For more information, see BigQuery audit logs overview.
What's next
- Learn how to manage connections.
- Learn how to analyze object tables by using remote functions.
- Learn how to query stored data:
  - Query data stored in Amazon S3.
  - Query data stored in Blob Storage.
  - Query structured data stored in Cloud Storage.
  - Query unstructured data stored in Cloud Storage.
  - Query data stored in Spanner.
  - Query data stored in Cloud SQL.
  - Query data stored in AlloyDB.
  - Query data using remote functions.
  - Query unstructured data using remote functions.
  - Query data using stored procedures for Apache Spark.
- Learn about external tables.